How Random Forests Predict Classes — and Continuous Values
Random Forests are often introduced as classification models: each tree votes for a class, and the forest returns the majority decision. This is correct, but it only describes half of what the method can do. Random Forests are also capable of predicting continuous values with the same underlying structure. The only change is how each tree produces its final output.
Below is a clear breakdown of both modes.
1. Random Forest for Classification
In the classification setting, each decision tree is trained to split the data in a way that increases class purity. Typical split criteria include:
- Gini impurity
- Entropy (information gain)

Both are computed from the class proportions inside a node.
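To make those criteria concrete, here is a minimal NumPy sketch of both measures; the function names `gini` and `entropy` are illustrative, not from any particular library:

```python
import numpy as np

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def entropy(labels):
    """Shannon entropy: -sum(p * log2(p)) over class proportions."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

mixed = np.array([0, 0, 1, 1])  # maximally impure two-class node
pure = np.array([1, 1, 1, 1])   # perfectly pure node
print(gini(mixed), gini(pure))        # 0.5 0.0
print(entropy(mixed), entropy(pure))  # 1.0 -0.0
```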
Each tree ends with several leaf nodes, and each leaf contains a distribution of class labels from the training samples that reached that region.
A tree’s prediction is simply the most common class inside its leaf.
For the full forest, the final output is decided through a majority vote across all tree predictions.
This makes Random Forests stable and robust for classification tasks.
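As a rough illustration, the sketch below uses scikit-learn (assumed installed) to compare the forest's prediction against a hard majority vote over its fitted trees. One caveat: scikit-learn's RandomForestClassifier actually averages class probabilities rather than counting hard votes, so the two can occasionally disagree on borderline points.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data purely for illustration.
X, y = make_classification(n_samples=500, n_features=8, random_state=42)
forest = RandomForestClassifier(n_estimators=25, random_state=42).fit(X, y)

x_new = X[:1]  # one query point

# Each fitted tree is exposed via .estimators_; collect their individual votes.
votes = np.array([tree.predict(x_new)[0] for tree in forest.estimators_])
majority = np.bincount(votes.astype(int)).argmax()

print("tree votes:    ", votes)
print("majority vote: ", majority)
print("forest.predict:", forest.predict(x_new)[0])
```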
2. Random Forest for Continuous-Value Prediction (Regression)
The structure of the trees does not change. The forest still creates:
- Many decision trees, each grown on a bootstrap sample of the training data
- Random feature subsets considered at each split
- Leaf nodes that summarize the training samples reaching them
The differences lie in what each leaf stores, how predictions are combined, and (as we will see) how splits are chosen.
Inside each leaf, instead of storing class counts, the tree stores a single number: the mean of all target values (y) of the training samples that reached that leaf.
For any new input, the tree follows its splits until it arrives at a leaf, and then outputs that leaf’s mean value.
When combining predictions, the forest takes the average of all tree outputs, as demonstrated in the sketch below.
This simple “replace vote with mean” strategy turns the Random Forest into a regression model capable of predicting any continuous metric — from temperature to price to medical measurements.
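A minimal scikit-learn sketch (same assumption that the library is available) confirms the averaging rule: the regression forest's output is simply the mean of its trees' outputs.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic regression data purely for illustration.
X, y = make_regression(n_samples=500, n_features=8, noise=0.1, random_state=0)
forest = RandomForestRegressor(n_estimators=25, random_state=0).fit(X, y)

x_new = X[:1]
tree_preds = np.array([tree.predict(x_new)[0] for tree in forest.estimators_])

print("mean of tree outputs:", tree_preds.mean())
print("forest.predict:      ", forest.predict(x_new)[0])  # matches, up to float error
```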
One more thing changes under the hood: instead of Gini or entropy, each split is chosen to maximize the reduction in variance, i.e. the drop in the sum of squared errors (SSE) within the node.
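Here is an illustrative NumPy sketch of that criterion; `sse` and `sse_reduction` are hypothetical helpers written for this article, not library functions:

```python
import numpy as np

def sse(y):
    """Sum of squared errors of y around the node mean."""
    return np.sum((y - y.mean()) ** 2) if len(y) else 0.0

def sse_reduction(y, left_mask):
    """How much a candidate split reduces SSE: parent minus the two children."""
    return sse(y) - sse(y[left_mask]) - sse(y[~left_mask])

y = np.array([1.0, 1.2, 0.9, 5.0, 5.1, 4.8])            # two obvious clusters
split = np.array([True, True, True, False, False, False])
print(sse_reduction(y, split))  # ~23.2: the split cleanly separates the clusters
```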
3. Summary
Random Forests use the same machinery for both tasks:
- An ensemble of decision trees
- Recursive splits that partition the feature space
- Predictions read off at the leaf nodes

The only change is the leaf output:
- Classification: the most common class in the leaf
- Regression: the mean of the target values in the leaf

And the only change in the ensemble rule:
- Classification: majority vote across trees
- Regression: average of all tree outputs
This unified view explains why Random Forests are highly flexible and easy to extend to many real-world problems.
4. Coming Next
Gradient Boosting methods such as Gradient Boosting Machines, XGBoost, LightGBM, and CatBoost share the same dual nature: they can perform both classification and continuous-value prediction. In a future article, we will explore how boosting uses gradients instead of votes or means, and why this often achieves higher accuracy on structured datasets.
#MachineLearning #RandomForest #DataScience #MLAlgorithms