Decision Tree Regressor Simplifies with Limited Data

I expected a full decision tree… but the data had other plans. 🌳📊

While experimenting with machine learning on my dataset, I built a Decision Tree Regressor in Python to understand how different variables relate to the target variable TAD. Using Scikit-Learn, I split the data into training and testing sets, trained the model, evaluated its performance, and visualized the resulting tree.

📊 Model Results
• R² = 1.0
• RMSE = 0.0

At first, I expected the visualization to produce a large, multi-branch decision tree. Instead, the output showed a single node: one small box.

🔍 Why did this happen? The answer lies in the dataset's structure:
• The dataset is very small (limited observations)
• The target variable barely varies across samples
• The model could therefore predict the outcome perfectly without splitting the data at all

Because of this, the tree never needed to grow additional branches; the optimal prediction was already available at the root node. In simple terms, the model found nothing meaningful to split on. A minimal sketch reproducing this behavior is shown below.

This was a great reminder that machine learning models are only as complex as the data requires. Sometimes the most interesting insight is realizing why the model stayed simple. 🔍
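Here is a minimal, self-contained sketch of that workflow. The original dataset isn't shown in the post, so the feature names and values below are hypothetical; a constant TAD column is just one simple way to reproduce the perfect scores and the single-node tree.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor, plot_tree
from sklearn.metrics import r2_score, mean_squared_error

# Hypothetical toy data: very few observations, and a target (TAD)
# that never varies across samples.
df = pd.DataFrame({
    "feature_1": [1.2, 3.4, 2.1, 4.0, 2.8, 3.1],
    "feature_2": [10.0, 12.0, 11.0, 13.0, 12.0, 11.0],
    "TAD":       [5.0, 5.0, 5.0, 5.0, 5.0, 5.0],
})

X = df[["feature_1", "feature_2"]]
y = df["TAD"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

# With zero variance in y, no split can reduce the MSE criterion,
# so the tree stops at the root.
model = DecisionTreeRegressor(random_state=42)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
# scikit-learn reports R² = 1.0 when predictions exactly match a
# constant target, matching the results above.
print("R²   :", r2_score(y_test, y_pred))                    # 1.0
print("RMSE :", np.sqrt(mean_squared_error(y_test, y_pred))) # 0.0
print("Depth:", model.get_depth())                           # 0
print("Nodes:", model.tree_.node_count)                      # 1

# The plot shows a single box: the root is also the only leaf, and
# its value is simply the mean of y_train.
plot_tree(model, feature_names=list(X.columns), filled=True)
plt.show()
```

Run the same script on data whose target actually varies and the tree grows multiple branches; get_depth() and tree_.node_count are quick checks on how much structure the model found.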

Dr James Daniel Paul P
Lovely Professional University (LPU)

#Python #MachineLearning #DecisionTree #DataScience #BusinessAnalytics #LearningJourney 🚀
