A brief introduction to the Random Forest algorithm in Machine Learning

Ritu Ranjan Routray

Published Feb 25, 2019

Random forest is a supervised learning algorithm. It builds a forest of decision trees, each trained on a random sample of the input data (typically drawn with replacement). The forest then combines the trees' outputs: a majority vote for classification, or an average for regression. In a random forest classifier, adding more trees generally makes the predictions more stable, although the gain in accuracy levels off beyond a certain number of trees.
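
The random sampling and the combination of trees can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not a full implementation: the helper names bootstrap_sample and majority_vote are hypothetical, the toy rows are made up, and the training of the individual trees is left out, so the per-tree votes are hard-coded.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement: one tree's random training set."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(predictions):
    """Return the label predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Toy dataset of (feature value, label) pairs, invented for illustration.
rows = [(1.0, "a"), (2.0, "a"), (3.0, "b"), (4.0, "b"), (5.0, "b")]

# One random dataset per tree; a real forest would now fit a tree on each.
samples = [bootstrap_sample(rows, rng) for _ in range(3)]

# Assumed per-tree predictions for a single new input (hard-coded here).
votes = ["a", "b", "b"]
forest_label = majority_vote(votes)
```

Because each sample is drawn with replacement, some rows appear more than once in a tree's dataset and others not at all, which is what makes the trees differ from one another.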

The random forest algorithm belongs to the family of decision tree methods. A decision tree represents a classification or regression model in the form of a tree: each internal node tests a feature from the input, each branch represents a decision, and each leaf at the end of a branch holds the corresponding output value.
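
To make the node, branch, and leaf roles concrete, here is a small hand-built tree in plain Python. It is a sketch only: the feature names (income, has_default), the thresholds, and the approve/reject labels are hypothetical, and a real tree would be learned from data rather than written by hand.

```python
# Internal nodes test one feature; leaves hold the output value.
tree = {
    "feature": "income",
    "threshold": 50_000,
    "left": {"leaf": "reject"},  # branch taken when income <= threshold
    "right": {                   # branch taken when income > threshold
        "feature": "has_default",
        "threshold": 0.5,
        "left": {"leaf": "approve"},
        "right": {"leaf": "reject"},
    },
}


def predict(node, sample):
    """Follow one branch per node until a leaf is reached."""
    while "leaf" not in node:
        goes_left = sample[node["feature"]] <= node["threshold"]
        node = node["left"] if goes_left else node["right"]
    return node["leaf"]


decision = predict(tree, {"income": 80_000, "has_default": 0})
```

Here decision is "approve": the sample goes right at the income node and left at the has_default node.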

Random forest features:

Random forest can be used for both classification and regression problems.
Some implementations of random forest can also handle missing values.
With more trees in the forest, the predictions become more stable, although the improvement levels off.
It can be used with both categorical and numerical features.
The algorithm is fairly stable: when new data is introduced, the model is not affected much, since the new data may change only some of the trees.
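
The first point can be illustrated with scikit-learn, which offers a separate estimator for each task. This is a rough sketch on synthetic data: the dataset sizes, the 50 trees, and the seeds are arbitrary choices made for illustration.

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

# Synthetic datasets; sizes and seeds are arbitrary.
X_cls, y_cls = make_classification(n_samples=200, n_features=8, random_state=0)
X_reg, y_reg = make_regression(n_samples=200, n_features=8, noise=0.1, random_state=0)

# n_estimators is the number of trees in the forest.
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_cls, y_cls)
reg = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_reg, y_reg)

class_labels = clf.predict(X_cls[:5])    # discrete class labels (0 or 1 here)
numeric_values = reg.predict(X_reg[:5])  # continuous values
```

The classifier returns the class chosen by the forest, while the regressor returns the average of the trees' numeric predictions.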

One important feature of random forest is that it is a reasonable fit for a wide range of ML problems.

Some examples of where the random forest algorithm is used:

Banking, Stock Market, E-commerce websites

Random forest has a few disadvantages as well:

Training and prediction become slow if we generate a large number of trees.
It is complex, and the model is more difficult to understand than a single decision tree.

The Python library "sklearn" (scikit-learn) provides classes and functions for applying random forest to input datasets.
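
A typical scikit-learn workflow might look like the following. Treat it as a minimal sketch: the iris dataset, the 25% test split, the 100 trees, and the seeds are assumptions chosen for illustration, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Small built-in dataset: 150 flowers, 4 numeric features, 3 classes.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit a forest of 100 trees on the training part only.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Mean accuracy on the held-out test part.
accuracy = model.score(X_test, y_test)

# One importance score per feature; the scores sum to 1.
importances = model.feature_importances_

print(f"test accuracy: {accuracy:.2f}")
```

The feature_importances_ attribute gives a partial view into a model that is otherwise hard to read tree by tree.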
