XGBoost Vs LightGBM

XGBoost Algorithm:

XGBoost (Extreme Gradient Boosting) is a very popular and in-demand algorithm, often cited as the winning algorithm in competitions across many platforms. It is an improved version of the Gradient Boosting algorithm, with the Gradient Boosted Decision Tree as its base. Its strong predictive power and easy-to-implement approach have made it a fixture in many machine learning notebooks. Some key points of the algorithm are as follows:

  1. It does not build the full tree structure exhaustively; it grows each tree greedily.
  2. Compared to LightGBM, it splits trees level-wise rather than leaf-wise.
  3. Plain Gradient Boosting uses only the negative gradient to optimize the loss function; XGBoost instead uses a second-order Taylor expansion of the loss (gradient and hessian).
  4. A regularization term penalizes overly complex tree models.
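Point 3 can be made concrete. For a binary-logistic objective, the second-order Taylor expansion means the booster works with the first derivative (gradient) and second derivative (hessian) of the loss with respect to the current raw prediction. A minimal plain-Python sketch of those derivatives (an illustration of the math, not XGBoost's actual internals):

```python
import math

def logistic_loss_derivatives(raw_score, label):
    """Gradient and hessian of the logistic loss w.r.t. the raw score,
    the quantities used in a second-order Taylor approximation."""
    p = 1.0 / (1.0 + math.exp(-raw_score))  # sigmoid of the raw margin
    grad = p - label          # first derivative of the loss
    hess = p * (1.0 - p)      # second derivative of the loss
    return grad, hess

# Example: raw score 0.0 (probability 0.5) against a positive label
g, h = logistic_loss_derivatives(0.0, 1)
print(g, h)  # -0.5 0.25
```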

Some parameters which can be tuned to increase the performance are as follows:

General Parameters include the following:

  1. booster: Has two options, gbtree (tree-based models) and gblinear (linear models).
  2. silent: If set to 1, no running messages are shown while the code executes.
  3. nthread: Used for parallel processing; the number of CPU cores is specified here.

Booster Parameters include the following:

  1. eta: The learning rate; makes the model more robust by shrinking the tree weights at each step.
  2. max_depth: Should be set carefully to avoid overfitting.
  3. max_leaf_nodes: If this parameter is defined, the model ignores max_depth.
  4. gamma: Specifies the minimum loss reduction required to make a split.
  5. lambda: The L2 regularization term on the weights.
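These booster parameters are typically collected into a parameter dictionary. The sketch below shows one hypothetical combination; the parameter names follow XGBoost's documentation, but the values are illustrative only, not recommendations:

```python
# Illustrative XGBoost parameter dictionary -- example values, not tuned recommendations
params = {
    "booster": "gbtree",
    "eta": 0.1,          # shrinks the weight of each new tree (learning rate)
    "max_depth": 6,      # cap tree depth to limit overfitting
    "gamma": 0.5,        # minimum loss reduction required to split
    "lambda": 1.0,       # L2 regularization on leaf weights
    "objective": "binary:logistic",
    "nthread": 4,        # number of cores for parallel processing
}
# A dictionary like this would typically be passed as:
#   xgb.train(params, dtrain, num_boost_round=100)
```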

Learning Task Parameters include the following:

1) objective: Defines the loss function to be used.

  • binary:logistic – logistic regression for binary classification; returns the predicted probability (not the class)
  • multi:softmax – multiclass classification using the softmax objective; returns the predicted class (not the probabilities)
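The difference in return values can be sketched without the library itself: binary:logistic applies a sigmoid to the raw score and returns a probability, while multi:softmax applies softmax over the class scores and returns the argmax class index. A plain-Python illustration (not XGBoost's code):

```python
import math

def binary_logistic(raw_score):
    """Return a predicted probability, as binary:logistic does."""
    return 1.0 / (1.0 + math.exp(-raw_score))

def multi_softmax(raw_scores):
    """Return the predicted class index, as multi:softmax does."""
    exps = [math.exp(s) for s in raw_scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    return probs.index(max(probs))  # class with the highest probability

print(binary_logistic(2.0))            # ~0.88: a probability, not a class
print(multi_softmax([0.1, 2.3, 0.4]))  # 1: a class index, not probabilities
```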

Light Gradient Boosting Machine:

LightGBM is a fast, distributed, high-performance gradient boosting framework based on a popular machine learning algorithm, the Decision Tree. It can be used for classification, regression, and many other machine learning tasks. The algorithm grows trees leaf-wise, choosing the leaf with the maximum delta loss to grow. LightGBM uses histogram-based algorithms, whose advantages are as follows:

  • Less memory usage
  • Reduced communication cost for parallel learning
  • Reduced cost of calculating the gain for each split in the decision tree
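The histogram idea behind these savings can be sketched in plain Python: continuous feature values are bucketed into a fixed number of bins, so only one small integer per value is stored and the tree evaluates one split candidate per bin edge instead of one per unique value. This is a simplified sketch of the concept, not LightGBM's actual implementation:

```python
def to_histogram_bins(values, max_bin):
    """Map continuous values to integer bin indices over [min, max].
    Simplified sketch of histogram-based split finding, not LightGBM's code."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / max_bin or 1.0  # guard against all-equal values
    # Each value becomes a small integer, capped at max_bin - 1
    return [min(int((v - lo) / width), max_bin - 1) for v in values]

feature = [0.3, 1.7, 2.2, 9.8, 5.5, 4.1]
bins = to_histogram_bins(feature, max_bin=4)
print(bins)  # [0, 0, 0, 3, 2, 1] -- only 4 candidate splits to scan
```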

Although LightGBM trains much faster, it can sometimes overfit. So, let us see which parameters can be tuned to obtain a better model.

To get the best fit, the following parameters must be tuned:

  1. num_leaves: Since LightGBM grows leaf-wise, this value should be less than 2^(max_depth) to avoid overfitting.
  2. min_data_in_leaf: For large datasets, set its value in the hundreds to thousands.
  3. max_depth: A key parameter whose value should be set carefully to avoid overfitting.
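The num_leaves rule in point 1 is easy to encode as a quick sanity check when tuning (a hypothetical helper for illustration, not part of LightGBM's API):

```python
def num_leaves_is_safe(max_depth, num_leaves):
    """Check the heuristic that num_leaves should stay below 2**max_depth
    for a leaf-wise grower. Hypothetical helper, not LightGBM API."""
    return num_leaves < 2 ** max_depth

print(num_leaves_is_safe(7, 70))   # True: 70 < 128
print(num_leaves_is_safe(7, 200))  # False: 200 >= 128, risks overfitting
```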

For achieving better accuracy, the following parameters must be tuned:

  1. Adding more training data to the model can increase accuracy (this can also be external, unseen data).
  2. num_leaves: Increasing its value can increase accuracy, since splitting happens leaf-wise, but overfitting may also occur.
  3. max_bin: A high value can improve accuracy but may eventually lead to overfitting.

