Avoid Overfitting with Regularization

Ahmed Gad

Published Jan 15, 2018

Have you ever created a machine learning model that fits the training samples perfectly but gives very poor predictions on unseen samples? Have you ever wondered why this happens? This article explains overfitting, which is one of the reasons for poor predictions on unseen samples. It also presents a regression-based regularization technique in simple steps to make clear how overfitting can be avoided.

The focus of machine learning (ML) is to train an algorithm with training data in order to create a model that makes correct predictions for unseen data (test data). To create a classifier, for example, a human expert starts by collecting the data required to train the ML algorithm. The expert is responsible for finding the best features to represent each class, that is, features capable of discriminating between the different classes. These features are then used to train the ML algorithm. Suppose we want to build an ML model that classifies images as containing cats or not, using the following training data.

The first question we have to answer is "what are the best features to use?". This is a critical question in ML because the better the features, the better the predictions the trained model makes, and vice versa. Let us look at the images and extract some features that are representative of cats. Two such features may be the presence of two dark eye pupils and two ears with a diagonal direction. Assume that we extracted these features, somehow, from the above training images and created a trained ML model. Such a model can work with a wide range of cat images because the features it uses are present in most cats. We can test the model using some unseen data such as the following. Assume that the classification accuracy on the test data is x%.

One may want to increase the classification accuracy. The first idea that comes to mind is to use more features than the two used previously, on the reasoning that the more discriminative features we use, the better the accuracy. Inspecting the training data again, we can find more features, such as the overall image color, since all training cat samples are white, and the color of the eye irises, since all training samples have yellow irises. The feature vector will have the 4 features shown below. They will be used to retrain the ML model.

After creating the trained model, the next step is to test it. The expected result of using the new feature vector is that the classification accuracy will drop below x%. But why? The accuracy drops because some of the features are present in the training data but do not hold for cat images in general. All training images have a white overall color and yellow irises, but these properties are not general to all cats, even though the model treats them as if they were. In the testing data, some cats are black or yellow rather than white as in the training data, and some cats do not have yellow irises.


This case, in which the features are powerful for the training samples but very poor for the testing samples, is known as overfitting. The model is trained with some features that are specific to the training data and do not hold in the testing data.
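The cat example can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the binary feature values are made up rather than extracted from real images, and the "model" is a hypothetical rule learner that labels a sample as a cat when it matches every feature value shared by all training cats. It is not the classifier used in this article, just a way to see the accuracy drop.

```python
# Toy data (made up for illustration, not extracted from real images).
# Columns: [two_dark_pupils, diagonal_ears, white_color, yellow_irises]
train_X = [
    [1, 1, 1, 1],  # cat
    [1, 1, 1, 1],  # cat
    [1, 1, 1, 1],  # cat
    [0, 0, 1, 0],  # not a cat
    [1, 0, 0, 1],  # not a cat
    [0, 1, 0, 0],  # not a cat
]
train_y = [1, 1, 1, 0, 0, 0]

test_X = [
    [1, 1, 1, 1],  # white cat, yellow irises
    [1, 1, 0, 1],  # black cat, yellow irises
    [1, 1, 0, 0],  # yellow cat, non-yellow irises
    [0, 0, 0, 0],  # not a cat
    [1, 0, 1, 0],  # not a cat
]
test_y = [1, 1, 1, 0, 0]


def fit_rule(X, y, columns):
    """Keep, for each chosen column, the value shared by every training cat."""
    positives = [row for row, label in zip(X, y) if label == 1]
    rule = {}
    for c in columns:
        values = {row[c] for row in positives}
        if len(values) == 1:  # the feature is constant across training cats
            rule[c] = values.pop()
    return rule


def predict(rule, row):
    """Predict 1 (cat) only if the row matches every value in the rule."""
    return int(all(row[c] == v for c, v in rule.items()))


def accuracy(rule, X, y):
    correct = sum(predict(rule, row) == label for row, label in zip(X, y))
    return correct / len(y)


general_rule = fit_rule(train_X, train_y, columns=[0, 1])        # pupils, ears
overfit_rule = fit_rule(train_X, train_y, columns=[0, 1, 2, 3])  # all 4 features

for name, rule in [("2 features", general_rule), ("4 features", overfit_rule)]:
    print(name,
          "train:", accuracy(rule, train_X, train_y),
          "test:", accuracy(rule, test_X, test_y))
```

With this toy data both rules classify every training sample correctly, but the 4-feature rule rejects the two test cats that are not white, so its test accuracy falls from 1.0 to 0.6 while its training accuracy stays at 1.0.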

The goal of the previous discussion was to introduce the idea of overfitting through a high-level example. To get into the details, it is preferable to work with a simpler example. That is why the rest of the discussion will be based on a regression example.
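As a rough preview of what overfitting and regularization look like in a regression setting, here is a minimal NumPy sketch. Everything in it is an assumption made for illustration: the noisy sine data, the polynomial degree, the penalty strength lam, and the helper names are hypothetical and are not taken from this article. It fits the same polynomial twice, once by plain least squares and once with an L2 penalty on the weights (commonly called ridge regression).

```python
import numpy as np

# Assumed toy data: 8 noisy samples of a sine curve (not the article's data).
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 8)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.2, size=x_train.size)


def design_matrix(x, degree):
    """Columns are x**0, x**1, ..., x**degree."""
    return np.vander(x, degree + 1, increasing=True)


def fit_least_squares(X, y):
    """Plain least squares: minimizes the training error only."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def fit_ridge(X, y, lam):
    """Least squares plus an L2 penalty lam * sum(w[1:] ** 2)."""
    penalty = np.eye(X.shape[1])
    penalty[0, 0] = 0.0  # leave the intercept unpenalized
    return np.linalg.solve(X.T @ X + lam * penalty, X.T @ y)


def mse(X, y, w):
    """Mean squared error of the predictions X @ w."""
    return float(np.mean((X @ w - y) ** 2))


degree = 6  # flexible enough to chase the noise in 8 samples
X_train = design_matrix(x_train, degree)

w_plain = fit_least_squares(X_train, y_train)
w_ridge = fit_ridge(X_train, y_train, lam=1e-2)

print("train MSE (plain):", mse(X_train, y_train, w_plain))
print("train MSE (ridge):", mse(X_train, y_train, w_ridge))
print("weight norm (plain):", np.linalg.norm(w_plain[1:]))
print("weight norm (ridge):", np.linalg.norm(w_ridge[1:]))
```

The penalized fit can never have a lower training error than the plain fit, and its weights are never larger in norm. That trade is the point of regularization: it gives up some fit on the training samples in exchange for a smoother model. Whether the result predicts unseen samples better depends on the data and on the choice of lam.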
