Mapping Machine Learning Algorithms to Use Cases
With such a variety of machine learning algorithms available and so many tool choices, it is understandable that those new to the field ask, “Which algorithm should I use?” The answer depends, of course, on what you want to do with the data and which business question you want to answer. You also need to consider the type of data you are working with, its size and its quality. There are other factors too, but let's keep things simple for now.
I have looked to some vendors for their mapping diagrams, better known in the industry as cheatsheets, and outlined the ones I think are most helpful.
I will caveat this article by saying that even the most experienced data scientist cannot know in advance which algorithm will perform best, and will usually try several. However, different machine learning algorithms are suited to different types of data and different problems, and this article should give you some guidance on which algorithms to try first, depending on a few clear factors.
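To make "try several" concrete, here is a minimal sketch of comparing a few candidate algorithms on the same data with cross-validation. It uses scikit-learn and its bundled iris dataset purely for illustration. The dataset, the three candidate models and their settings are my assumptions, not recommendations from any of the cheatsheets below.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative dataset only; swap in your own features X and labels y.
X, y = load_iris(return_X_y=True)

# A handful of candidates (an arbitrary choice for this sketch).
# Scaling is wrapped into the pipeline for the models that are sensitive to it.
candidates = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "k_nearest_neighbours": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Score every candidate the same way: mean accuracy over 5 cross-validation folds.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates.items()
}

# Print the candidates from best to worst mean accuracy.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

The point is not which model comes out on top here, but that a like-for-like comparison on your own data is cheap, so a cheatsheet only needs to narrow down the shortlist.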
The first mapping is the Azure Machine Learning Algorithm Cheat Sheet, which does a great job of bridging a business case and the appropriate algorithm. Microsoft say this mapping "helps you choose the best machine learning algorithm for your predictive analytics solution. Your decision is driven by both the nature of your data and the goal you want to achieve with your data."
Azure Machine Learning has a large library of algorithms from the classification, recommender systems, clustering, anomaly detection, regression, and text analytics families. Each is designed to address a different type of machine learning problem.
Next up is a flowchart from scikit-learn, designed 'to give users a bit of a rough guide on how to approach problems with regard to which estimators to try on your data':
https://scikit-learn.org/stable/tutorial/machine_learning_map/index.html
I have included the link because the chart is interactive: you can click on any of the machine learning algorithms in the green boxes to read its documentation, which is really convenient.
Last but not least, I found a really great ML algorithm cheatsheet from SAS. They give good guidance on how to use it: Read the path and algorithm labels on the chart as "If <path label> then use <algorithm>."
For example:
- If you want to perform dimension reduction then use principal component analysis.
- If you need a numeric prediction quickly, use decision trees or linear regression.
- If you need a hierarchical result, use hierarchical clustering.
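As a rough sketch, the three example rules above map onto scikit-learn estimators as follows. SAS's chart is tool-agnostic, so the choice of scikit-learn, the bundled datasets and every parameter value here are my own assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import load_diabetes, load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Rule 1: dimension reduction -> principal component analysis.
# Project the 4 iris features down to 2 components.
X_iris, _ = load_iris(return_X_y=True)
X_reduced = PCA(n_components=2).fit_transform(X_iris)
print("PCA output shape:", X_reduced.shape)

# Rule 2: a quick numeric prediction -> linear regression or a decision tree.
# Both are fitted to the same numeric target for comparison.
X_diab, y_diab = load_diabetes(return_X_y=True)
linear = LinearRegression().fit(X_diab, y_diab)
tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X_diab, y_diab)
print("Linear regression predictions:", linear.predict(X_diab[:3]))
print("Decision tree predictions:", tree.predict(X_diab[:3]))

# Rule 3: a hierarchical result -> hierarchical (agglomerative) clustering.
# The merge tree is cut at 3 clusters here, an arbitrary choice.
hier = AgglomerativeClustering(n_clusters=3).fit(X_iris)
print("Cluster sizes:", np.bincount(hier.labels_))
```

Each estimator is used with near-default settings, which is in keeping with the rule-of-thumb spirit of the chart: a first thing to try, not a tuned solution.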
And a key point from SAS, which really summarises all of these guides:
"it is important to remember these paths are intended to be rule-of-thumb recommendations, so ...the recommendations are not exact."
I hope this article gets you started tinkering with machine learning, or acts as a refresher/reminder for the seasoned pros.