Intro to Machine Learning

Machine Learning is the field of study that gives computers the ability to learn without being explicitly programmed.

There are so many types of ML systems that it is useful to classify them in broad categories, based on the following criteria:

  • Whether or not they are trained with human supervision (supervised, unsupervised, semi-supervised, and reinforcement learning).
  • Whether or not they can learn incrementally on the fly (online versus batch learning).
  • Whether they work by simply comparing new data points to known data points, or instead by detecting patterns in the training data and building a predictive model, much like scientists do (instance-based versus model-based learning).

In this article we will discuss only the first criterion.

1. Supervised:

In supervised learning, the training set you feed to the algorithm includes the desired solutions, called labels.

A typical supervised learning task is classification:

The spam filter is a good example of this: it is trained with many example emails along with their class (spam or ham), and it must learn how to classify new emails.
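To make the idea concrete, here is a minimal sketch of a spam classifier trained on labeled emails. It uses a simple naive Bayes word-count model, which is only one possible choice and is not named in the reference; the tiny training set and the function names `train` and `classify` are made up for illustration.

```python
import math
from collections import Counter

# Hypothetical labeled training set: (email text, label) pairs.
TRAIN = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
    ("project report attached", "ham"),
]

def train(examples):
    """Count word occurrences per class and the number of emails per class."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.split())
    return word_counts, class_counts

def classify(text, word_counts, class_counts):
    """Return the class with the highest log-probability (Laplace smoothing)."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_emails = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Start from the prior probability of the class.
        score = math.log(class_counts[label] / total_emails)
        n_words = sum(word_counts[label].values())
        for word in text.split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

word_counts, class_counts = train(TRAIN)
print(classify("free prize now", word_counts, class_counts))
print(classify("monday project meeting", word_counts, class_counts))
```

The labels in `TRAIN` play the role of the "desired solutions": the model never sees a rule for what spam is, only examples with their class.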

Another typical task is to predict a target numeric value, such as the price of a car, given a set of features (mileage, age, brand, etc.) called predictors. This sort of task is called regression. To train the system, you need to give it many examples of cars, including both their predictors and their labels (i.e., their prices).
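As a rough illustration of regression, the sketch below fits a straight line to a handful of cars using ordinary least squares with a single predictor (mileage). The numbers are invented and deliberately fall on a perfect line to keep the arithmetic easy to follow; `fit_line` and `predict` are hypothetical helper names.

```python
# Hypothetical training data: mileage in thousands of km (predictor)
# and price in dollars (label). Values are invented for illustration.
mileage = [10.0, 30.0, 50.0, 70.0, 90.0]
price = [19000.0, 17000.0, 15000.0, 13000.0, 11000.0]

def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Predict the price of a car with mileage x."""
    return slope * x + intercept

slope, intercept = fit_line(mileage, price)
print(slope, intercept)
print(predict(40.0, slope, intercept))
```

A real model would use several predictors (age, brand, and so on) and noisy data, but the principle is the same: learn parameters from labeled examples, then apply them to a new car.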

Note that some regression algorithms can be used for classification as well, and vice versa. For example, Logistic Regression is commonly used for classification.
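The following sketch shows how Logistic Regression can act as a classifier: it outputs a probability, and thresholding that probability at 0.5 gives a class. It is a bare-bones version trained with plain gradient descent on one invented feature (the number of suspicious words in an email); the data, learning rate, and epoch count are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical data: number of suspicious words in an email, and its class
# (1 = spam, 0 = ham).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 0, 1, 1, 1]

def train_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit weight w and bias b by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def predict_proba(x, w, b):
    """Estimated probability that an email with x suspicious words is spam."""
    return sigmoid(w * x + b)

w, b = train_logistic(xs, ys)
for x in (1.0, 4.0):
    label = "spam" if predict_proba(x, w, b) >= 0.5 else "ham"
    print(x, label)
```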

Here are some of the most important supervised learning algorithms:

• Linear Regression.

• Logistic Regression.

• Support Vector Machines (SVMs).

• Decision Trees and Random Forests.

• Neural networks.

2. Unsupervised:

In unsupervised learning, as you might guess, the training data is unlabeled. The system tries to learn without a teacher.

For example, say you have a lot of data about your blog’s visitors. You may want to run a clustering algorithm to try to detect groups of similar visitors.

At no point do you tell the algorithm which group a visitor belongs to: it finds those connections without your help.

For example, it might notice that 40% of your visitors are males who love comic books and generally read your blog in the evening, while 20% are young sci-fi lovers who visit during the weekends. If you use a hierarchical clustering algorithm, it may also subdivide each group into smaller groups. This may help you target your posts for each group.
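Here is a small sketch of what clustering visitors could look like. It uses k-means, which is just one clustering algorithm (not the hierarchical variant mentioned above), and it assumes each visitor is described by two invented features: age and the hour of day they usually visit. The starting centroids are picked by hand to keep the run deterministic; real implementations usually pick them at random.

```python
def closest(point, centroids):
    """Index of the centroid nearest to point (squared Euclidean distance)."""
    distances = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return distances.index(min(distances))

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to centroids and moving centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[closest(p, centroids)].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    labels = [closest(p, centroids) for p in points]
    return centroids, labels

# Hypothetical visitors: (age, usual visiting hour). No group labels are given.
visitors = [(18, 21), (20, 22), (19, 20), (45, 9), (50, 10), (47, 8)]

# Simplification: start from the first and last visitor.
centroids, labels = kmeans(visitors, [visitors[0], visitors[-1]])
print(labels)
print(centroids)
```

Note that the algorithm is never told which group a visitor belongs to; the group indices in `labels` come only from how similar the visitors are to one another.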

3. Semi-supervised:

Since labeling data is usually time-consuming and costly, you will often have plenty of unlabeled instances, and few labeled instances. Some algorithms can deal with data that’s partially labeled.

Some photo-hosting services, such as Google Photos, are good examples of this. Once you upload all your family photos to the service, it automatically recognizes that the same person A shows up in photos 1, 5, and 11, while another person B shows up in photos 2, 5, and 7. This is the unsupervised part of the algorithm (clustering). Now all the system needs is for you to tell it who these people are. Just add one label per person and it is able to name everyone in every photo, which is useful for searching photos.
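The labeling step can be sketched in a few lines. The snippet assumes the clustering has already been done and simply spreads one user-supplied name per cluster to every photo; the names "Alice" and "Bob" and the helpers `name_people` and `search` are hypothetical, and this is not a description of how any real photo service works internally.

```python
# Assumed output of the unsupervised (clustering) step:
# photo number -> face clusters detected in that photo.
photo_clusters = {1: ["A"], 2: ["B"], 5: ["A", "B"], 7: ["B"], 11: ["A"]}

# The only labels the user provides: one name per cluster (hypothetical names).
cluster_names = {"A": "Alice", "B": "Bob"}

def name_people(photo_clusters, cluster_names):
    """Propagate each cluster's single label to every photo it appears in."""
    return {
        photo: [cluster_names.get(cluster, "unknown") for cluster in clusters]
        for photo, clusters in photo_clusters.items()
    }

def search(named_photos, person):
    """Return the sorted photo numbers in which the given person appears."""
    return sorted(photo for photo, people in named_photos.items() if person in people)

named_photos = name_people(photo_clusters, cluster_names)
print(search(named_photos, "Alice"))
print(search(named_photos, "Bob"))
```

Two labels were enough to name six faces across five photos, which is the appeal of combining a little labeled data with a lot of unlabeled data.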

4. Reinforcement Learning:

Reinforcement Learning is a very different beast. The learning system, called an agent in this context, can observe the environment, select and perform actions, and get rewards in return (or penalties in the form of negative rewards).

It must then learn by itself what is the best strategy, called a policy, to get the most reward over time.

A policy defines what action the agent should choose when it is in a given situation.
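A toy example may help here. In the sketch below, an agent walks along a corridor of five positions and is rewarded for reaching the last one, with a small penalty for every other step. It learns with tabular Q-learning, which is one common reinforcement learning algorithm chosen for illustration and not something the reference prescribes; the environment, reward values, and hyperparameters are all assumptions.

```python
import random

N_STATES = 5         # positions 0..4; position 4 is the goal
ACTIONS = (-1, +1)   # move left or move right

def step(state, action):
    """Environment: apply the action, return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.1  # small penalty (negative reward) per step
    return next_state, reward, done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(2)                          # explore
            else:
                a = 0 if q[state][0] > q[state][1] else 1     # exploit
            next_state, reward, done = step(state, ACTIONS[a])
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

def policy(q):
    """Best known action for each non-goal position."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]

q_table = train()
print(policy(q_table))
```

The list returned by `policy` is exactly what the text calls a policy: for each situation (position), it says which action the agent should choose.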


Ref: Aurélien Géron, Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow.
