The Use of Regularization in the Context of Overfitting in Machine Learning
Over the past two months I have been hard at work at Atlas School learning about Machine Learning! I've recently taken some notes on regularization, specifically techniques used to combat Overfitting. They may be useful for your own Machine Learning journey!
Overfitting occurs when the model works really, really well with its current data set, but cannot work with new sets due to various issues, such as insufficient training data, irrelevant or misleading training data, training sessions that allow the machine to memorize patterns (hampering its ability to learn new ones), and the like. The topics that follow reflect measurements and strategies aimed at minimizing and mitigating the issue of overfitting.
One example we may recognize is the prevalent inability of modern AI to distinguish among minorities in its implementation of facial recognition. This is likely due to the training data not containing enough faces of minorities, leading to false positives at a higher rate for those groups. Unfortunately, these programs are used for surveillance and security, and this has led to many wrongful arrests.
L1 regularization (AKA Lasso regularization) encourages machines to set some of their coefficients to zero by adding the absolute value of the coefficients' magnitudes as a penalty to the loss function. The modified loss function is calculated with the formula below:
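A sketch of the penalized loss, assuming an ordinary least-squares base loss with coefficients β_j and a regularization strength λ (the base loss can vary by model):

$$\text{Loss} = \sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 + \lambda \sum_{j=1}^{p}\left|\beta_j\right|$$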
This limits the machine's processing of irrelevant and misleading data, and rather "zooms in" to specific parts of the data, allowing for feature selection.
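As a rough, illustrative sketch (the synthetic dataset and the alpha value here are my own choices, not from the sources below), scikit-learn's Lasso shows this feature-selection effect by driving the coefficients of irrelevant features to exactly zero:

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data: only the first two of five features actually drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)

# The L1 penalty (strength controlled by alpha) pushes the coefficients of the
# three irrelevant features to exactly zero, effectively selecting features.
model = Lasso(alpha=0.1).fit(X, y)
print(model.coef_)  # expect something like [~2.9, ~-1.9, 0., 0., 0.]
```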
L2 regularization (AKA Ridge Regression) works by limiting the variance of the coefficients, shrinking them toward zero without eliminating them. This is useful for collinear and codependent features. It adds the squared magnitude of the coefficients as a penalty, as calculated below:
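Again as a sketch, assuming the same least-squares base loss and notation as above:

$$\text{Loss} = \sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 + \lambda \sum_{j=1}^{p}\beta_j^2$$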
It encourages our machine to find a balance of all features, or in other words, a Libra.
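A minimal sketch in scikit-learn (again with made-up data and an arbitrary alpha), showing how Ridge spreads weight across two nearly identical, collinear features instead of letting one coefficient explode:

```python
import numpy as np
from sklearn.linear_model import Ridge

# Two nearly identical (collinear) features that both explain the target.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 1))
X = np.hstack([base, base + rng.normal(scale=0.01, size=(200, 1))])
y = 4 * base[:, 0] + rng.normal(scale=0.1, size=200)

# The L2 penalty keeps both coefficients small and similar, balancing the
# features rather than zeroing one out.
model = Ridge(alpha=1.0).fit(X, y)
print(model.coef_)  # expect two moderate, similar coefficients (~2 each)
```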
Dropout is a method that involves randomly ignoring neurons within chosen layers. Within a chosen layer, it will "turn off" a randomly determined subset of neurons at each training step. To make up for the missing neurons, the remaining neurons' outputs are scaled up in importance. This prevents neurons from becoming hyper-specific, allowing models to generalize to unseen data. The effect could be considered similar to the concept of expanding neuroplasticity in humans, or perhaps increasing the ability to problem solve by randomly strengthening remaining neurons.
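As a sketch of how this usually looks in practice (the layer sizes, the 30% dropout rate, and the input shape are arbitrary placeholders), Keras adds dropout as its own layer between dense layers:

```python
from tensorflow import keras
from tensorflow.keras import layers

# During training, 30% of each hidden layer's neurons are randomly "turned off"
# on every step; at inference time all neurons are active, and Keras rescales
# outputs automatically so the two modes match.
model = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```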
Data Augmentation is the creation of new data sets from old ones, AKA synthetic data sets. It's like re-using play-doh after you're done mixing different sets of colors together. This is done through various techniques applied to the old training data, such as altering spatial properties, adjusting shades and hues, and introducing imperfections. This can be an effective process in that it increases the number of true negatives within the output. This method was especially useful back in the day (the 90's) when datasets were limited. Some downsides are that quality assurance is expensive but unfortunately necessary when working with the generated input, and that research and development on the process itself is also imperative.
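As an illustrative sketch (the specific transforms, their ranges, and the input shape are my own placeholder choices), Keras preprocessing layers can apply these spatial and color alterations randomly to each image during training:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Each training image is randomly flipped, rotated, zoomed, and contrast-jittered,
# producing "new" synthetic examples from the old ones on the fly.
data_augmentation = keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
    layers.RandomContrast(0.2),
])

model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),
    data_augmentation,  # only active during training
    layers.Conv2D(16, 3, activation="relu"),
    layers.GlobalAveragePooling2D(),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```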
Early Stopping is exactly what it sounds like. It detects when performance is deteriorating by working with a validation dataset. Some signals you (or rather your program) can use to decide when to stop are an increase in false negatives, no change in a metric over a specific number of epochs, an absolute change in a metric, a decrease in performance, or an average change in a metric. Using this process, you make better use of the training data. It limits the number of epochs necessary, and thus limits the amount of time needed to reach a finished product.
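A minimal sketch with Keras's EarlyStopping callback; the monitored metric, the patience value, and the commented-out training call (with placeholder names like X_train and X_val) are assumptions for illustration:

```python
from tensorflow import keras

# Stop training once validation loss has not improved by at least min_delta
# for `patience` consecutive epochs, then roll back to the best weights seen.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    min_delta=1e-4,             # an "absolute change in a metric"
    patience=5,                 # "no change over a specific number of epochs"
    restore_best_weights=True,
)

# model.fit(X_train, y_train,
#           validation_data=(X_val, y_val),
#           epochs=200,
#           callbacks=[early_stop])
```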
This is my current understanding of the topics; I know much of it may be surface level. I look forward to revisiting this topic in the future!
Sources:
https://towardsdatascience.com/l1-and-l2-regularization-methods-ce25e7fc831c
https://www.geeksforgeeks.org/dropout-regularization-in-deep-learning/
https://www.datacamp.com/tutorial/complete-guide-data-augmentation
https://www.geeksforgeeks.org/regularization-by-early-stopping/