ReLU

ReLU (Rectified Linear Unit) is an activation function commonly used in neural networks to introduce non-linearity into the network.
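ReLU itself is just max(0, x); here is a minimal Python sketch (the function name is my own, for illustration):

def relu(x):
    # Pass positive inputs through unchanged; clamp negatives to 0.
    return max(0.0, x)

print(relu(3.5))   # 3.5
print(relu(-2.0))  # 0.0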


It addresses some of the issues associated with the sigmoid and tanh activation functions. The problem with sigmoid is that it is not zero-centered: its outputs are all positive, so gradient updates fluctuate (zig-zag) during backpropagation, and convergence takes a long time, which can consume significant resources.
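A quick illustrative check (plain Python, not from the article): sigmoid squashes every input into (0, 1), so its outputs are always positive and never centered around zero.

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Every output lands in (0, 1) -- all positive, never zero-centered.
print([round(sigmoid(x), 3) for x in (-5, -1, 0, 1, 5)])
# [0.007, 0.269, 0.5, 0.731, 0.993]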


Another problem comes from the derivative of sigmoid, which lies in the range (0, 0.25]. During backpropagation these small derivatives are multiplied together layer after layer, so the updated weights end up almost identical to the old weights, eventually leading to the vanishing gradient problem.
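To see why the (0, 0.25] bound matters, here is a small sketch (assuming the sigmoid function above): the derivative is sigma(x) * (1 - sigma(x)), which peaks at 0.25, and chaining even that best case across ten layers leaves almost nothing.

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1.0 - s)

print(sigmoid_derivative(0.0))  # 0.25, the maximum possible value

# Best-case derivative chained across 10 layers, as backprop would do:
grad = 1.0
for _ in range(10):
    grad *= 0.25
print(grad)  # ~9.5e-07 -- the gradient has effectively vanished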


Tanh is zero-centered but faces the same problem as sigmoid: the vanishing gradient.
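Its derivative, 1 - tanh(x)^2, reaches at most 1 (at x = 0) and decays rapidly once the input saturates; a quick illustrative check:

import math

def tanh_derivative(x):
    t = math.tanh(x)
    return 1.0 - t * t

print(round(tanh_derivative(0.0), 2))  # 1.0
print(round(tanh_derivative(3.0), 2))  # 0.01 -- saturated, gradient nearly gone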


ReLU outputs values in the range [0, ∞). Its derivative is either 0 or 1: negative inputs give 0 and positive inputs give 1, so gradients flowing through active neurons are not scaled down, solving the problem of the vanishing gradient.
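A minimal sketch of that derivative (assuming the relu function above; returning 0 at x = 0 is a common convention):

def relu_derivative(x):
    # 1 for positive inputs, 0 for negative inputs (and 0 at x = 0 by convention).
    return 1.0 if x > 0 else 0.0

print([relu_derivative(x) for x in (-2.0, 3.5)])  # [0.0, 1.0]
# A gradient multiplied by 1 across many active layers does not shrink.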


But ReLU comes with the problem of dead neurons. There can be a scenario where ReLU(z) = 0 for every input a neuron sees.

This happens when the input to a ReLU neuron is always negative, leading to zero gradients during backpropagation and preventing any weight updates.
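A toy illustration with hypothetical numbers (reusing the relu_derivative sketch above): if the weights and bias keep the pre-activation negative for every input, the gradient is zero on every example and gradient descent can never revive the neuron.

# Hypothetical dead neuron: weight and bias drive z below zero for all inputs.
weight, bias = -2.0, -1.0
inputs = [0.5, 1.0, 2.0]

for x in inputs:
    z = weight * x + bias          # -2.0, -3.0, -5.0: always negative
    print(z, relu_derivative(z))   # derivative is 0.0 -> no weight update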

To solve this issue we have a variant of ReLU called Leaky ReLU, which has a small, non-zero slope for negative inputs:

def leaky_relu(x, alpha=0.01):
    if x > 0:
        return x
    else:
        return alpha * x

Here alpha is a small positive constant (typically very close to zero, e.g., 0.01).
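A quick usage check of the function above:

print(leaky_relu(3.0))   # 3.0 -- positive inputs pass through unchanged
print(leaky_relu(-2.0))  # -0.02 -- gradient for negative inputs is alpha, not 0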
