My experience in Machine Learning
I have been thinking of trying ML - Machine Learning - for some time, and in the last two weeks I finally spent time learning it and running a few experiments using AzureML and AWS SageMaker.
In the entire process of ML - data acquisition, data preparation, modelling & deployment - I realized that model training is a very small piece of the whole exercise. Most of the time and effort is really needed for data acquisition and preparation. If the organization already has an analytics practice set up - data assets, a data lake, and data views in place, with domain experts available to support analytics - then data acquisition and preparation won't be such a big hurdle.
Based on what I learned running these experiments, here is what is important.
Data acquisition - How do you make the data available for analysis? Identify the data source and make it accessible to your machine learning platform. AzureML, AWS and Google can all consume CSV files via URLs, so simply exposing your on-premise/cloud data through URLs is sufficient.
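Once the CSV is exposed, loading it is a one-liner on any of these platforms. As a rough sketch in Python (I used R for my experiments; the field names here are made up for illustration), `urllib.request.urlopen(url)` would give you the same file-like stream for a real hosted CSV that the in-memory sample stands in for below:

```python
import csv
import io

def load_rows(fileobj):
    """Parse a CSV stream into a list of dicts keyed by the header row."""
    return list(csv.DictReader(fileobj))

# A real run would use: load_rows(urllib.request.urlopen(url))
sample = io.StringIO("size_sqft,price\n1000,200000\n1500,290000\n")
rows = load_rows(sample)
print(rows[0]["price"])  # -> 200000 (values arrive as strings)
```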
Data preparation - At this step, the key is to identify and prepare the data fields that contribute to the problem being addressed - domain knowledge and an understanding of the data schema are important here. This activity often needs knowledge of additional tools like Excel, cloud tables, SQL or R. I decided to use R; in the near future I'll probably try Python as well.
I started with some simple examples of regression (prediction), like house prices, and then tried my hand at predicting machine failure. For that one I ran into some complexity preparing the data for modelling - I had to learn data analysis techniques like correlations, cross-tabulations, and handling missing values and outliers. After building and testing the model, the individual machine-level failure prediction was accurate to 95%+ for all five algorithms I tried.
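The preparation techniques mentioned above are each only a few lines of code. Here is a minimal Python sketch of the four (I did this in R; these hand-rolled helpers are just illustrations, not the platform's API):

```python
from collections import Counter
from statistics import mean, stdev

def pearson(xs, ys):
    """Pearson correlation coefficient, computed directly from the definition."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den

def cross_tab(rows, a, b):
    """Count co-occurrences of two categorical fields."""
    return Counter((r[a], r[b]) for r in rows)

def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    m = mean(observed)
    return [m if v is None else v for v in values]

def drop_outliers(values, z=3.0):
    """Drop points more than z standard deviations from the mean."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= z * s]

print(pearson([1, 2, 3], [2, 4, 6]))     # perfectly correlated -> 1.0
print(fill_missing([10, 12, None, 11]))  # mean-imputes the gap
```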
The other problem I tried was classification - classifying email as spam or not. Here I was able to hit 80%+ accuracy. One of the real-world use cases I'll probably try next is identifying the support group for a ticket based on the email contents and then automating the assignment.
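To give a feel for what a spam classifier does under the hood, here is a toy multinomial Naive Bayes over bag-of-words features in Python - one common choice for this problem, though not necessarily the algorithms the platforms used in my experiments:

```python
import math
from collections import Counter

class NaiveBayes:
    """Minimal multinomial Naive Bayes over bag-of-words features."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.class_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        scores = {}
        total_docs = sum(self.class_counts.values())
        for c in self.class_counts:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.class_counts[c] / total_docs)  # class prior
            for w in text.lower().split():
                # Laplace smoothing so unseen words don't zero out the score
                score += math.log(
                    (self.word_counts[c][w] + 1) / (total_words + len(self.vocab))
                )
            scores[c] = score
        return max(scores, key=scores.get)

clf = NaiveBayes().fit(
    ["win free money now", "meeting at noon", "free prize claim now", "lunch tomorrow noon"],
    ["spam", "ham", "spam", "ham"],
)
print(clf.predict("free money"))  # -> spam
```

A real mailbox would need far more training data and better tokenization, but the scoring logic is the same idea.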
I am yet to explore the other two types of data problems - clustering & anomaly detection. That's for the next two weeks.
Between the two platforms, AWS and AzureML, I find AzureML easier to start with. Development is visual, and scripts are needed only when one wants to use R features for data preparation. AWS is more script-oriented. Deploying a model as a web service is easy on both platforms. My personal choice is AzureML, simply because it lets you achieve a lot without learning scripting languages like Python or R. But in the long term it is beneficial to learn at least one scripting language.
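Both platforms generate the scoring endpoint for you, but conceptually a deployed model is just an HTTP handler that accepts features and returns a prediction. A bare-bones Python illustration of that idea (the route, field names, and the toy linear "model" are my own stand-ins, not what either platform actually generates):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(size_sqft):
    """Toy linear model standing in for a trained one."""
    return 150.0 * size_sqft + 50_000

class ScoreHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body, e.g. {"size_sqft": 1000}
        length = int(self.headers["Content-Length"])
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": predict(payload["size_sqft"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the console quiet
        pass

# To serve for real, uncomment:
# HTTPServer(("127.0.0.1", 8000), ScoreHandler).serve_forever()
```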
Overall, it was a good skill addition and experience.