MLOps : Session - 18
In today's article, I'm going to discuss the Multilayer Perceptron. In my previous article, I already discussed the Single Perceptron. You can check it here.
Let's first look at some more terminology that we use when training our model.
Activation Function:
The activation function is one of the building blocks of a neural network. You can think of it as a formula we give to a neuron so the model can learn during training. If we compare it with our brain, which is a combination of many neurons, the activation function is what ultimately decides what gets fired to the next neuron.
Here are some points that help build a better understanding of activation functions:
- It takes the output signal from the previous cell and converts it into a form that can be taken as input by the next cell.
- A linear function such as y = w*x + b always produces some value and, if not restricted to a certain range, can grow very large in magnitude, especially in very deep neural networks with millions of parameters. In classification models, we usually need the output in a fixed range (between 0 and 1), so we need an activation function, such as the sigmoid, to do this.
- There are many types of activation functions, which we choose based on the output we want: the sigmoid function, ReLU function, step function, tanh function, and many more. You can get details of each here.
Parameter and hyperparameter :
Basically, parameters are the values that the model learns and uses for prediction, such as weights and biases. Hyperparameters are the values we give to the machine for the learning process, i.e., the settings we choose before training starts, such as the learning rate and the number and size of the hidden layers.
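A tiny illustration of the difference, using nothing beyond plain Python: the learning rate and epoch count below are hyperparameters we fix up front, while the weight w and bias b are parameters that the training loop itself adjusts (here it fits the line y = 2x + 1 by gradient descent):

```python
# Hyperparameters: chosen by us before training begins
learning_rate = 0.05
epochs = 200

# Parameters: learned by the model during training
w, b = 0.0, 0.0

# Toy data generated from y = 2x + 1
data = [(x, 2 * x + 1) for x in range(-5, 6)]

for _ in range(epochs):
    for x, y in data:
        pred = w * x + b
        error = pred - y
        # Gradient-descent updates on the parameters only;
        # the hyperparameters never change during training
        w -= learning_rate * error * x
        b -= learning_rate * error
```

After training, w and b end up close to 2 and 1; changing the hyperparameters changes how (and whether) the parameters get there.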
Binary Classification:
In the last deep learning article, we learned how to train a linear regression model. Here we take an example of binary classification...
1. Load Dataset
This is a dataset of bank customer details; given a customer's details, we predict whether or not that customer left the bank.
2. We have some categorical variables, so first we handle these by converting them into numerical form...
This finally gives us the dependent and independent variables.
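As a sketch of these two steps (the column names below are made up for illustration and may differ from the article's actual dataset), encoding the categorical variables and splitting out the dependent and independent variables with pandas might look like:

```python
import pandas as pd

# Toy stand-in for the bank dataset; real column names may differ
df = pd.DataFrame({
    "CreditScore": [600, 720, 850, 500],
    "Geography": ["France", "Spain", "France", "Germany"],
    "Gender": ["Male", "Female", "Female", "Male"],
    "Exited": [1, 0, 0, 1],  # 1 = the customer left the bank
})

# One-hot encode the categorical columns, dropping one level of each
# to avoid redundant (perfectly correlated) dummy columns
df = pd.get_dummies(df, columns=["Geography", "Gender"], drop_first=True)

# Independent variables (features) and dependent variable (label)
X = df.drop("Exited", axis=1)
y = df["Exited"]
```

After this, X contains only numeric columns and y holds the 0/1 label the model will learn to predict.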
3. First we import the Sequential model. We used the same model for regression too.
4. The Dense() function is used to create hidden layers. Let's see...
I create 4 hidden layers. Units is the number of neurons in one layer.
In the first three, we use the ReLU activation function, which outputs zero for negative inputs and passes positive values through.
In the 4th layer, we use the sigmoid function, because this is a binary classification problem in which we need the output as 0 or 1, and the sigmoid function gives an output between 0 and 1.
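Putting the layers together, a sketch of such a model in Keras might look like the following (the unit counts and input size here are illustrative assumptions, not the article's exact values):

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Input(shape=(10,)),                # assumed: 10 input features
    # Three hidden layers with ReLU activation
    Dense(units=6, activation="relu"),
    Dense(units=6, activation="relu"),
    Dense(units=6, activation="relu"),
    # Final layer: a single sigmoid unit, so the output lands in (0, 1)
    Dense(units=1, activation="sigmoid"),
])

# The untrained model already maps inputs to probabilities in (0, 1)
probs = model.predict(np.zeros((2, 10)), verbose=0)
```

The output can then be thresholded (e.g., at 0.5) to turn the probability into a 0/1 class decision.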
5. Now the last part is to train our model.
I use the Adam optimizer with a learning rate of 0.000001. We ran 200 epochs here and found that the loss decreases steadily.
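Compiling and training might look like the sketch below. I use synthetic data, a smaller model, and far fewer epochs so it runs quickly, and a larger learning rate than the article's 0.000001 purely so this toy example converges; these are my own illustrative choices, not the article's exact setup:

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

# Synthetic binary-classification data (stand-in for the bank dataset)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype("float32")

model = Sequential([
    Input(shape=(4,)),
    Dense(8, activation="relu"),
    Dense(1, activation="sigmoid"),
])

# Adam optimizer; binary cross-entropy is the standard loss
# for a sigmoid output in binary classification
model.compile(optimizer=Adam(learning_rate=0.01),
              loss="binary_crossentropy",
              metrics=["accuracy"])

history = model.fit(X, y, epochs=20, batch_size=32, verbose=0)
```

Inspecting `history.history["loss"]` after training is exactly the "check the loss again and again" step described next.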
The number of hidden layers, the learning rate, the number of neurons, etc. are chosen by checking the loss again and again; there is no fixed way to determine these values.
There are two cross-entropy loss functions commonly used in classification:
- Binary CrossEntropy (for two-class problems like this one)
- Categorical CrossEntropy (for problems with more than two classes)
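As a small preview before those sessions: for a single example with true label y (0 or 1) and predicted probability p, binary cross-entropy is -(y·log p + (1-y)·log(1-p)). A minimal hand-rolled version:

```python
import math

def binary_cross_entropy(y_true, p_pred):
    # Near 0 for a confident correct prediction;
    # grows very large for a confident wrong one
    return -(y_true * math.log(p_pred)
             + (1 - y_true) * math.log(1 - p_pred))
```

For example, predicting p = 0.99 for a true label of 1 gives a loss near zero, while predicting p = 0.1 for the same label gives a much larger loss, which is what pushes the network's parameters in the right direction during training.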
We will talk about both in detail in our next sessions. If some of the terminology here is unfamiliar, I suggest going through the last few articles, which cover the things I have not described here, such as forward and backward propagation, the learning rate, the concept of an epoch, etc.
Hope we meet again...
Happy Learning :)