Machine Learning
AI is the science of training machines to perform tasks that would normally require human intelligence. The term was coined in the 1950s, when scientists began exploring how computers could solve problems on their own. In simple terms, AI means giving machines human-like abilities so that they can reason about a task the way a human brain would. We take our brains for granted: they effortlessly make sense of the world around us every second of every day. AI is the idea that a computer can do the same.
AI is the broad science of mimicking human abilities; machine learning is a specific subset of AI that trains a machine how to learn. A machine learning model looks for patterns in data and tries to draw conclusions, much as our brains do. The machines are not explicitly programmed by people: we give them examples, and they learn what to do from those examples. That is a huge difference, because it is much easier for us to give examples than it is to write code. Once the algorithm gets good at drawing the right conclusions, it applies that knowledge to new sets of data.
That is the life cycle: ask the question, collect the data, train the algorithm, try it out, collect feedback, and use that feedback to improve the algorithm.
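The idea of learning from examples rather than from hand-written rules can be sketched with a toy nearest-neighbour classifier. The fruit measurements and labels below are invented purely for illustration:

```python
# A toy "learn from examples" model: 1-nearest-neighbour classification.
# Instead of writing rules for what an apple looks like, we give the
# machine labelled examples and let it copy the label of the closest one.

def predict(examples, features):
    """Label a new point with the label of the closest training example."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    closest = min(examples, key=lambda ex: distance(ex[0], features))
    return closest[1]

# Training examples: (weight in grams, diameter in cm) -> fruit label
examples = [
    ((150, 7.0), "apple"),
    ((170, 7.5), "apple"),
    ((120, 6.0), "orange"),
    ((110, 5.8), "orange"),
]

print(predict(examples, (160, 7.2)))  # → apple (closest to the apple examples)
```

Feeding the model more labelled examples (the feedback step of the life cycle) improves its conclusions without anyone rewriting the code.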
Real-world example:
The Google self-driving car has a laser on top that tells it where it is relative to its surroundings. It also has radar at the front that informs the car of the speed and motion of the vehicles around it. The car uses all of this data to figure out how to drive, and also to predict what the drivers around it are going to do. The car processes almost a gigabyte of data while doing this.
What's required to create good machine learning systems?
Data preparation capabilities.
Algorithms – basic and advanced.
Automation and iterative processes.
Scalability.
Ensemble modelling.
In machine learning, a target is called a label.
In statistics, a target is called a dependent variable.
A variable in statistics is called a feature in machine learning.
A transformation in statistics is called feature creation in machine learning.
Who's using it?
Financial services
Government
Health care
Retail
Oil and gas
Transportation
Types of machine learning problems
There are various ways to classify machine learning problems. Here, we discuss the most common ones.
1. On the basis of the nature of the learning “signal” or “feedback” available to a learning system
Supervised learning: The computer is presented with example inputs and their desired outputs, given by a “teacher”, and the goal is to learn a general rule that maps inputs to outputs. The training process continues until the model achieves the desired level of accuracy on the training data. Some real-life examples are:
Image Classification: You train the model with labelled images; later, you give it a new image, expecting it to recognize the object.
Market Prediction/Regression: You train the model with historical market data and ask it to predict future prices.
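The market-prediction idea can be sketched with a simple least-squares line fit: learn the trend from historical prices, then extrapolate one step ahead. The price history here is an invented toy series:

```python
# Sketch of supervised regression: fit price = slope * t + intercept by
# least squares over a toy price history, then predict the next step.

def fit_line(ts, ys):
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys)) \
        / sum((t - mean_t) ** 2 for t in ts)
    intercept = mean_y - slope * mean_t
    return slope, intercept

prices = [100.0, 102.0, 104.0, 106.0]   # invented, perfectly linear history
ts = list(range(len(prices)))
slope, intercept = fit_line(ts, prices)
next_price = slope * len(prices) + intercept
print(next_price)  # → 108.0 (the trend continued one step)
```

Real market data is far noisier, but the supervised pattern is the same: historical inputs and known outputs train the model, which is then applied to unseen inputs.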
Unsupervised learning: No labels are given to the learning algorithm, leaving it on its own to find structure in its input. It is used for clustering a population into different groups. Unsupervised learning can be a goal in itself (discovering hidden patterns in data).
Clustering: You ask the computer to separate similar data into clusters; this is essential in research and science.
High-Dimensional Visualization: Use the computer to help us visualize high-dimensional data.
Generative Models: After a model captures the probability distribution of your input data, it will be able to generate more data. This can be very useful to make your classifier more robust.
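Clustering without labels can be sketched with k-means: the algorithm alternates between assigning each point to its nearest centre and moving each centre to the mean of its cluster. The 1-D points below are invented and form two obvious groups:

```python
import random

# Minimal k-means sketch for 1-D data with k = 2 clusters. No labels are
# given; the structure emerges from the assign/update loop alone.

def kmeans(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)           # start from k random points
    for _ in range(iters):
        # Assignment step: each point joins its nearest centre's cluster.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: abs(p - centers[j]))
            clusters[nearest].append(p)
        # Update step: move each centre to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
print(kmeans(points, 2))  # two centres, one near 1.0 and one near 9.1
```

Here the two groups are recovered without anyone telling the algorithm which point belongs where, which is exactly the unsupervised setting described above.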
Semi-supervised learning: Problems where you have a large amount of input data and only some of it is labelled are called semi-supervised learning problems. They sit between supervised and unsupervised learning. For example, a photo archive where only some of the images are labelled (e.g. dog, cat, person) and the majority are unlabelled.
Reinforcement learning: A computer program interacts with a dynamic environment in which it must perform a certain goal (such as driving a vehicle or playing a game against an opponent). The program is provided feedback in terms of rewards and punishments as it navigates its problem space.
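The reward-and-punishment loop can be sketched with tabular Q-learning on a toy environment: a 5-cell corridor where the agent is rewarded only for reaching the last cell, so it must learn from feedback alone that moving right is good. The environment and all parameters are invented for illustration:

```python
import random

# Toy reinforcement-learning sketch: tabular Q-learning on a 5-cell corridor.
# The agent starts at cell 0 and receives reward 1 only on reaching cell 4.

N_STATES, ACTIONS = 5, ("left", "right")
GOAL = N_STATES - 1

def step(state, action):
    nxt = max(0, state - 1) if action == "left" else min(GOAL, state + 1)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL   # next state, reward, episode done?

def train(episodes=300, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit current knowledge, sometimes explore.
            if rng.random() < eps:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt, reward, done = step(state, action)
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next
                                           - q[(state, action)])
            state = nxt
    return q

q = train()
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]
print(policy)  # the learned policy should prefer "right" in every cell
```

Nothing ever tells the agent which action is correct; the preference for "right" emerges purely from the rewards it experiences, which is the defining trait of reinforcement learning.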
2. On the basis of the “output” desired from a machine-learned system
Classification: Inputs are divided into two or more classes, and the learner must produce a model that assigns unseen inputs to one or more (multi-label classification) of these classes. This is typically tackled in a supervised way. Spam filtering is an example of classification, where the inputs are email (or other) messages, and the classes are “spam” and “not spam”.
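The spam-filtering example can be sketched with a tiny naive Bayes classifier using word counts and Laplace smoothing; the training messages below are invented toy data, not a real corpus:

```python
import math
from collections import Counter

# Sketch of spam filtering as classification: multinomial naive Bayes
# with Laplace (add-one) smoothing over a handful of invented messages.

train_data = [
    ("win money now", "spam"),
    ("free prize win", "spam"),
    ("meeting at noon", "not spam"),
    ("lunch at noon tomorrow", "not spam"),
]

def train(data):
    word_counts = {"spam": Counter(), "not spam": Counter()}
    class_counts = Counter()
    for text, label in data:
        class_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab

def classify(text, word_counts, class_counts, vocab):
    total = sum(class_counts.values())
    best, best_score = None, float("-inf")
    for label in class_counts:
        # Log prior plus smoothed log likelihood of each word in the message.
        score = math.log(class_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for w in text.split():
            score += math.log((word_counts[label][w] + 1)
                              / (n_words + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best

model = train(train_data)
print(classify("win a free prize", *model))  # → spam
```

The model assigns the unseen message to one of the two classes, "spam" or "not spam", exactly as described above.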
Regression: It is also a supervised learning problem, but the outputs are continuous rather than discrete. For example, predicting the stock prices using historical data.
Clustering: Here, a set of inputs is to be divided into groups. Unlike in classification, the groups are not known beforehand, making this typically an unsupervised task.
In a typical example, the dataset's points end up divided into groups identifiable by colours such as red, green, and blue.
Density estimation: The task is to find the distribution of inputs in some space.
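The simplest density estimator is a histogram: count how many samples fall into each bin and normalise so the bars integrate to one. The samples below are invented:

```python
# Sketch of density estimation with a histogram: estimate p(x) from samples
# by counting how many land in each bin of the interval [lo, hi].

def histogram_density(samples, lo, hi, n_bins):
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in samples:
        i = min(int((x - lo) / width), n_bins - 1)  # clamp x == hi into last bin
        counts[i] += 1
    n = len(samples)
    # Normalise so the bar heights integrate to 1 over [lo, hi].
    return [c / (n * width) for c in counts]

samples = [0.1, 0.2, 0.25, 0.3, 0.8, 0.9]
density = histogram_density(samples, 0.0, 1.0, 2)
print(density)  # more probability mass in the first half of the interval
```

More bins give a finer picture of the distribution at the cost of needing more samples per bin.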
Dimensionality reduction: It simplifies inputs by mapping them into a lower-dimensional space. Topic modelling is a related problem, where a program is given a list of human language documents and is tasked to find out which documents cover similar topics.
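Dimensionality reduction can be sketched with a bare-bones principal component analysis: find the direction of greatest variance (here via power iteration on a 2x2 covariance matrix) and project each 2-D point onto it, leaving a single coordinate per point. The points are invented and lie roughly along the line y = x:

```python
# Sketch of dimensionality reduction: project 2-D points onto their first
# principal component, found by power iteration on the covariance matrix.

def first_principal_component(points, iters=100):
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    centered = [(x - mx, y - my) for x, y in points]
    # Entries of the 2x2 covariance matrix.
    cxx = sum(x * x for x, _ in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    # Power iteration: repeatedly apply the matrix and renormalise; the
    # vector converges to the eigenvector with the largest eigenvalue.
    vx, vy = 1.0, 0.0
    for _ in range(iters):
        wx, wy = cxx * vx + cxy * vy, cxy * vx + cyy * vy
        norm = (wx * wx + wy * wy) ** 0.5
        vx, vy = wx / norm, wy / norm
    return (vx, vy), centered

points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9)]
(vx, vy), centered = first_principal_component(points)
# 1-D coordinates: each centred point's projection onto the component.
coords = [x * vx + y * vy for x, y in centered]
print([round(c, 2) for c in coords])
```

Each 2-D point is now summarised by one number along the dominant direction, with most of the spread in the data preserved.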