How to optimize the model with Optuna?

Optuna is an automatic hyperparameter optimization framework designed for machine learning. It integrates with libraries such as CatBoost, Keras, LightGBM, MXNet, PyTorch, TensorFlow, XGBoost, and many more. More details about Optuna can be found on the official website [1].
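Before applying Optuna to our dataset, here is a minimal sketch of its workflow: an objective function receives a trial object, samples hyperparameters from it, and returns a score that Optuna tries to optimize. The quadratic function below is purely illustrative.

import optuna

def objective(trial):
    # Sample a value for x in [-10, 10]; Optuna will search this range
    x = trial.suggest_float('x', -10, 10)
    return (x - 2) ** 2

study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=20)
print(study.best_params)  # should be close to {'x': 2}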

Let's implement the optimization of a model on a real dataset.

We are using kidney stone data: a binary classification task where we predict whether a patient has a kidney stone or not. The data used in this analysis can be found on Kaggle [2].

First, import the required libraries:

import pandas as pd
import numpy as np
import optuna
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.inspection import permutation_importance

from xgboost import XGBClassifier

from sklearn.model_selection import train_test_split, RepeatedStratifiedKFold

Here, we use cross-validation to evaluate each trial during optimization, and later we will also look at feature importance.

train_df = pd.read_csv('train.csv')
X = train_df.drop(['target'], axis=1)
y = train_df.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=43)
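Since the dataset is small and we will use stratified folds later, it is worth a quick look at the shape and class balance first. A minimal check (assuming the target column is named target, as above):

print(train_df.shape)
print(y.value_counts(normalize=True))  # fraction of samples in each class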

Now, visualize the correlations in the data with a heatmap.

corr = train_df.corr()
mask = np.triu(corr)
sns.heatmap(corr, mask=mask, annot=True, fmt='.3f')        
Correlation heatmap of data

Now, let's draw a pair plot to see the distribution of each feature and the trends in the data.

sns.pairplot(data=train_df, hue='target', corner=True, 
             plot_kws={'s':80, 'edgecolor':'white','linewidth':2.5}, 
             palette='viridis')        
Pair plot of the features, colored by target

Now that we have some idea of the trends in the data, let's optimize the hyperparameters.

optuna.logging.set_verbosity(optuna.logging.INFO)


def objective(trial):
    params = {
        'verbosity': 0,
        'n_estimators': trial.suggest_int('n_estimators', 50, 1500),
        'learning_rate': trial.suggest_float('learning_rate', 1e-7, 1e-1),
        'max_depth': trial.suggest_int('max_depth', 3, 20),
        'colsample_bytree': trial.suggest_float('colsample_bytree', 0.1, 1.0),
        'alpha': trial.suggest_float('alpha', 1e-5, 1e2),
        'lambda': trial.suggest_float('lambda', 1e-5, 1e2),
        'objective': 'binary:logistic',
        'eval_metric': 'auc',
        'booster': trial.suggest_categorical('booster', ['dart', 'gbtree', 'gblinear']),
        'min_child_weight': trial.suggest_int('min_child_weight', 0, 5),
        'tree_method': 'gpu_hist'
    }
    
    kf = RepeatedStratifiedKFold(n_splits=10, n_repeats=2, random_state=42)
    
    scores = []
    for train_idx, test_idx in kf.split(X,y):
        X_train_fold, X_val_fold = X.iloc[train_idx], X.iloc[test_idx]
        y_train_fold, y_val_fold = y.iloc[train_idx], y.iloc[test_idx]
        
        xgb_model = XGBClassifier(**params)
        xgb_model.fit(X_train_fold, y_train_fold)
        
        y_pred = xgb_model.predict_proba(X_val_fold)[:, 1]  # AUC needs probabilities, not hard labels
        score = roc_auc_score(y_val_fold, y_pred)
        scores.append(score)
    return np.mean(scores)


study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=10, n_jobs=-1)
study.best_params        
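Once the study has finished, study.best_value holds the best mean cross-validated AUC, and Optuna's built-in visualizations (which require plotly) help inspect the search. A quick sketch:

from optuna.visualization import plot_optimization_history, plot_param_importances

print(study.best_value)                  # best mean CV AUC found
plot_optimization_history(study).show()  # objective value over trials
plot_param_importances(study).show()     # which hyperparameters mattered most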

From the above code, Optuna returned the following optimized hyperparameters:

xgb_optuna_params = {'n_estimators': 521,
 'learning_rate': 0.048058528487311035,
 'max_depth': 19,
 'colsample_bytree': 0.45072121994519687,
 'alpha': 25.06956546276981,
 'lambda': 12.722971177461535,
 'booster': 'gbtree',
 'min_child_weight': 4}        

Let's see the difference between the default and the optimized parameters.

Default Parameters

kf = RepeatedStratifiedKFold(n_splits=10, random_state=42, n_repeats=10)

default_param_scores = []

for train_idx, val_idx in kf.split(X, y):
    X_train, y_train = X.iloc[train_idx], y.iloc[train_idx]
    X_val, y_val = X.iloc[val_idx], y.iloc[val_idx]
    
    model = XGBClassifier().fit(X_train, y_train)
    
    y_pred = model.predict_proba(X_val)
    
    score = roc_auc_score(y_val, y_pred[:, 1])
    default_param_scores.append(score)
    
print(np.array(default_param_scores).mean())  # 0.7515

Optimized parameter

kf = RepeatedStratifiedKFold(n_splits=10, random_state=42, n_repeats=10)

optimize_param_scores = []


for train_idx, val_idx in kf.split(X, y):
    X_train, y_train = X.iloc[train_idx], y.iloc[train_idx]
    X_val, y_val = X.iloc[val_idx], y.iloc[val_idx]
    
    model = XGBClassifier(**xgb_optuna_params).fit(X_train, y_train)
    
    y_pred = model.predict_proba(X_val)
    
    score = roc_auc_score(y_val, y_pred[:, 1])
    optimize_param_scores.append(score)
    
print(np.array(optimize_param_scores).mean())  # 0.7793
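We imported permutation_importance at the start to understand feature importance; it shows which features the optimized model actually relies on. A minimal sketch using the hold-out split from earlier (scoring='roc_auc' is an assumption chosen to match our metric; the CV loops above reused the X_train name, so the split is re-created):

# Re-create the hold-out split, since the CV loops above overwrote X_train/y_train
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=43)

model = XGBClassifier(**xgb_optuna_params).fit(X_train, y_train)

result = permutation_importance(model, X_test, y_test,
                                scoring='roc_auc', n_repeats=10, random_state=42)

# Print features sorted by mean importance, highest first
for idx in result.importances_mean.argsort()[::-1]:
    print(f'{X.columns[idx]}: {result.importances_mean[idx]:.4f} '
          f'+/- {result.importances_std[idx]:.4f}')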

With only a few trials we can already see a clear improvement, and we can optimize the parameters further by giving the search more time.
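Calling study.optimize again continues the same study with additional trials. For longer searches, a study can also be backed by persistent storage so it survives across sessions; a sketch (the SQLite path and study name below are illustrative):

# Run 90 more trials on top of the 10 already completed
study.optimize(objective, n_trials=90, n_jobs=-1)

# Alternatively, create a storage-backed study that can be resumed later
study = optuna.create_study(study_name='xgb_kidney_stone',
                            storage='sqlite:///optuna.db',
                            direction='maximize',
                            load_if_exists=True)
study.optimize(objective, n_trials=100)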

The full code will be available on GitHub.

Feel free to reach me at jivaniutsav007@gmail.com for any questions, concerns, or suggestions!

I hope you will find it insightful!

References:

  1. https://optuna.org/
  2. https://www.kaggle.com/competitions/playground-series-s3e12
