Introduction to Machine Learning!
“The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.” - Edsger W. Dijkstra
What is Machine Learning? Machine Learning is a subfield of computer science and artificial intelligence (AI) that focuses on the design of systems that can learn from and make decisions and predictions based on data. Machine learning enables computers to act and make data-driven decisions rather than being explicitly programmed to carry out a certain task.
Some examples:
- The heavily hyped, self-driving Google car? The essence of machine learning.
- Online recommendation offers like those from Amazon and Netflix? Machine learning applications for everyday life.
- Knowing what customers are saying about you on Twitter? Machine learning combined with linguistic rule creation.
- Fraud detection? One of the more obvious, important uses in our world today.
Why is it so prevalent these days? The resurgence of interest in machine learning is due to factors such as growing volumes and varieties of available data, computational processing that is cheaper and more powerful, and affordable data storage. All of these things mean it is possible to quickly and automatically produce models that can analyse bigger, more complex data and deliver faster, more accurate results – even on a very large scale. The result? High-value predictions that can guide better decisions and smart actions in real time without human intervention. Basically, machine learning solves problems that cannot be solved by numerical means alone.
"Humans can typically create one or two good models a week; machine learning can create thousands of models a week." - Thomas H. Davenport, an American academic and author specialising in analytics, business process innovation and knowledge management.
What are some popular machine learning methods? Two of the most widely adopted machine learning methods are supervised learning and unsupervised learning. Most machine learning – about 70 percent – is supervised learning. Unsupervised learning accounts for 10 to 20 percent. Semi-supervised learning and reinforcement learning are two other approaches that are sometimes used.
- Supervised learning algorithms are trained using labelled examples, that is, inputs for which the desired output is known. For example, a piece of equipment could have data points labelled either “F” (failed) or “R” (runs). The learning algorithm receives a set of inputs along with the corresponding correct outputs, and it learns by comparing its actual output with the correct outputs to find errors. It then modifies the model accordingly. Through methods like classification, regression, prediction and gradient boosting, supervised learning uses patterns to predict the values of the label on additional unlabelled data. Supervised learning is commonly used in applications where historical data predicts likely future events. For example, it can anticipate when credit card transactions are likely to be fraudulent or which insurance customers are likely to file a claim.
- Unsupervised learning is used on data that has no historical labels. The system is not told the "right answer"; the algorithm must figure out for itself what it is being shown. The goal is to explore the data and find some structure within it. Unsupervised learning works well on transactional data. For example, it can identify segments of customers with similar attributes who can then be treated similarly in marketing campaigns. Or it can find the main attributes that separate customer segments from each other. Popular techniques include self-organising maps, nearest-neighbour mapping, k-means clustering and singular value decomposition. These algorithms are also used to segment text topics, recommend items and identify data outliers.
- Semi-supervised learning is used for the same applications as supervised learning, but it uses both labelled and unlabelled data for training – typically a small amount of labelled data with a large amount of unlabelled data (because unlabelled data is less expensive and takes less effort to acquire). This type of learning can be used with methods such as classification, regression and prediction. Semi-supervised learning is useful when the cost associated with labelling is too high to allow for a fully labelled training process. Early examples of this include identifying a person's face on a webcam.
- Reinforcement learning is often used for robotics, gaming and navigation. With reinforcement learning, the algorithm discovers through trial and error which actions yield the greatest rewards. This type of learning has three primary components: the agent (the learner or decision maker), the environment (everything the agent interacts with) and actions (what the agent can do). The objective is for the agent to choose actions that maximise the expected reward over a given amount of time. The agent will reach the goal much faster by following a good policy, that is, a good rule for choosing which action to take in each situation. So the goal in reinforcement learning is to learn the best policy.
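To make the supervised case a bit more concrete, here is a minimal sketch in Python of the equipment example above, using k-nearest neighbours (one of the algorithms in the list further down). The sensor readings, the two features (temperature and vibration) and the predict helper are all invented for illustration; they are assumptions, not data from a real machine. A real project would more likely use a library and scale the features first.

```python
import math

# Hypothetical labelled training data, invented for illustration:
# (temperature in Celsius, vibration in mm/s) -> "F" (failed) or "R" (runs).
TRAINING_DATA = [
    ((60.0, 1.0), "R"),
    ((65.0, 1.5), "R"),
    ((70.0, 2.0), "R"),
    ((95.0, 7.0), "F"),
    ((100.0, 8.0), "F"),
    ((105.0, 9.0), "F"),
]


def predict(features, training_data=TRAINING_DATA, k=3):
    """Predict a label by majority vote among the k closest labelled examples."""
    # Sort the labelled examples by Euclidean distance to the new data point.
    by_distance = sorted(training_data, key=lambda ex: math.dist(features, ex[0]))
    # Keep only the labels of the k nearest examples.
    nearest_labels = [label for _, label in by_distance[:k]]
    # Return the label that occurs most often among those neighbours.
    return max(set(nearest_labels), key=nearest_labels.count)


if __name__ == "__main__":
    print(predict((62.0, 1.2)))   # a cool, quiet machine
    print(predict((98.0, 7.5)))   # a hot, shaky machine
```

The "learning" here is very simple: the model just remembers the labelled examples and labels a new, unlabelled reading according to the known examples closest to it. Other supervised methods build a more compact model, but the idea of learning from inputs paired with correct outputs is the same.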
Some important ML algorithms are:
- SVM (Support Vector Machine)
- Regression
- Classification and Decision Trees
- ANOVA
- Boosting
- Clustering
- Gradient Descent
- K-Nearest Neighbours
- Naive Bayes
- Random Forest
- Deep Learning
* The next post will explain the algorithms listed above.
Thanks!