Nikos G.’s Post

Most data scientists don't know this, to my surprise. Bagging trees and random forests do not overfit when you increase the number of trees, but they do underfit if the number is low. Beyond some point, adding extra trees just increases training time without any benefit. #datascience #ai #python (image from ISLP, page 348)

[Chart from ISLP, page 348]
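A minimal sketch of the point above, assuming scikit-learn and a synthetic dataset (the dataset sizes and tree counts are illustrative, not from the post). `warm_start` grows one forest incrementally so we can track test error as trees are added:

```python
# Sketch: test error of a random forest as trees are added.
# Assumption: synthetic data via make_classification, not any particular dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# warm_start=True keeps already-fitted trees and only adds new ones.
rf = RandomForestClassifier(n_estimators=1, warm_start=True, random_state=0)
errors = {}
for n in (1, 10, 50, 200, 500):
    rf.n_estimators = n
    rf.fit(X_tr, y_tr)
    errors[n] = 1 - rf.score(X_te, y_te)

# Test error drops quickly, then flattens -- it does not climb back up
# as more trees are added.
print(errors)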

If your only goal is accuracy, going past 200 trees rarely makes a difference, in my opinion. But secondary metrics, such as feature importance, benefit greatly from going up to 2,000 trees: they stabilize much better from run to run at that level.
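One way to see the stabilization claim, sketched with scikit-learn on synthetic data (the tree counts and seed count are assumptions for illustration): measure how much each feature's importance varies across independent runs, for a small forest versus a large one.

```python
# Sketch: run-to-run spread of feature importances, small vs. large forest.
# Assumption: synthetic data; 20 vs. 1000 trees stand in for "few" vs. "many".
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=15, n_informative=5,
                           random_state=0)

def importance_spread(n_trees, seeds=range(5)):
    # Std of each feature's importance across independent seeds, averaged.
    imps = [RandomForestClassifier(n_estimators=n_trees, random_state=s)
            .fit(X, y).feature_importances_ for s in seeds]
    return np.std(imps, axis=0).mean()

spread_small = importance_spread(20)
spread_large = importance_spread(1000)

# The large forest's importances vary far less between runs.
print(spread_small, spread_large)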

But they can and will overfit with trees that are too deep, splits that are too small, and so on. Always use a hyperparameter optimizer if you want a robust solution; I use Optuna. If you have enough data, optimize with cross-validation inside the loop and confirm on a completely held-out sample (test set).
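A sketch of that workflow with Optuna: cross-validation inside the objective for tuning, then a final check on a held-out test set. The search space, trial count, and dataset are assumptions for illustration, not a recommendation:

```python
# Sketch: Optuna tuning with CV inside the objective, confirmed on a test set.
# Assumption: synthetic data; search space and n_trials chosen for illustration.
import optuna
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(n_samples=1500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

def objective(trial):
    params = {
        "max_depth": trial.suggest_int("max_depth", 2, 20),
        "min_samples_split": trial.suggest_int("min_samples_split", 2, 20),
        "max_features": trial.suggest_float("max_features", 0.2, 1.0),
    }
    model = RandomForestClassifier(n_estimators=100, random_state=0, **params)
    # Cross-validation on the training split only -- the test set stays unseen.
    return cross_val_score(model, X_tr, y_tr, cv=5).mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=15)

# Confirm on data never touched during tuning.
best = RandomForestClassifier(n_estimators=100, random_state=0,
                              **study.best_params)
test_acc = best.fit(X_tr, y_tr).score(X_te, y_te)
print(study.best_value, test_acc)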
