MLOps: Session 12
In my last article, I talked about Gradient Descent and the Perceptron. These are some of the most important concepts in deep learning, and they give you a sense of what happens behind the code.
Today I will talk about some more terminology and try to explain all the terms we use in our code. Let's look at the code below, which trains a simple linear regression model using deep learning. Here I use the Keras module to train my model, on a dataset with 10,000 rows.
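The original code is not reproduced in this section, so here is a minimal sketch of what such a Keras simple linear regression model could look like. The synthetic data, the 3.5/2.0 coefficients, and the hyperparameters are my assumptions for illustration, not the article's actual dataset.

```python
import numpy as np
from tensorflow import keras

# Hypothetical data: 10,000 rows, one input feature, one target
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(10000, 1))
y = 3.5 * X[:, 0] + 2.0 + rng.normal(0, 1.0, size=10000)  # assumed true relation

# A single Dense unit with no activation is exactly a linear regression
model = keras.Sequential([
    keras.Input(shape=(1,)),
    keras.layers.Dense(1),
])

# Adam optimizer, mean squared error loss (as discussed below)
model.compile(optimizer="adam", loss="mean_squared_error")
model.fit(X, y, epochs=5, verbose=0)
```

The model has only two trainable parameters, one weight and one bias, which play the role of the slope and intercept of the regression line.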
This is the information about the dataset I am going to use in my model.
There are two important processes involved in the training of any neural network:
- Forward Propagation: Certain parameter values are randomly initialized. These parameters take part in mathematical operations in the hidden layers, and the result is sent to the output layer, which generates the final prediction.
- Backward Propagation: Once the output is generated, the next step is to compare it with the actual value. Based on how close or far the output is from the actual value (the error), the parameter values are updated. The forward propagation process is then repeated with the updated parameters, and new outputs are generated.
This is the base of any neural network algorithm.
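The two steps above can be sketched by hand for a single linear neuron with a squared error loss. The toy values and learning rate below are assumptions chosen so every number is easy to check.

```python
# One training step for a single linear neuron: y_hat = w*x + b
x, y_true = 2.0, 9.0   # one training sample (assumed values)
w, b = 0.5, 0.0        # "randomly" initialized parameters
lr = 0.01              # learning rate

# Forward propagation: compute the prediction
y_hat = w * x + b               # 0.5*2 + 0 = 1.0

# Compare with the actual value: squared error loss
loss = (y_hat - y_true) ** 2    # (1 - 9)^2 = 64.0

# Backward propagation: gradients of the loss w.r.t. each parameter
grad_w = 2 * (y_hat - y_true) * x   # 2*(-8)*2 = -32.0
grad_b = 2 * (y_hat - y_true)       # 2*(-8)  = -16.0

# Update the parameters, then the forward pass is repeated
w -= lr * grad_w   # 0.5 + 0.32 = 0.82
b -= lr * grad_b   # 0.0 + 0.16 = 0.16
```

Repeating this forward/backward loop many times is exactly what the training code does behind the scenes, just with many neurons and many samples at once.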
We use mean_squared_error to measure the loss (error).
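Mean squared error is just the average of the squared differences between the actual and predicted values. A minimal sketch, with made-up sample values:

```python
def mean_squared_error(y_true, y_pred):
    # Average of squared differences between actual and predicted values
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# (0.5^2 + 0.0^2 + 1.0^2) / 3 = 1.25 / 3 ≈ 0.4167
mean_squared_error([3.0, 5.0, 7.0], [2.5, 5.0, 8.0])
```

Squaring penalizes large errors much more than small ones, which is one reason it is a common default loss for regression.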
We discussed optimizers in our last article. There are many algorithms we can use as an optimizer; we will discuss these in detail in the next session. In the above code, we use Adam as the optimizer.
Learning Rate: Gradient descent has a parameter called the learning rate. It controls how much the model changes in response to the estimated error each time the model weights are updated. There are multiple ways to select a good starting point for the learning rate. A naive approach is to try a few different values and see which one gives you the best loss without sacrificing training speed.
Epochs: One epoch is when the ENTIRE dataset is passed forward and backward through the neural network exactly ONCE. As the number of epochs increases, the weights in the neural network are updated more times. The number of epochs is chosen so that the error is satisfactory to you.
Why do we need multiple epochs?
I found the best explanation of this question in a Quora answer.
When the dataset is too big, we can't pass all the data to the computer at once. To overcome this, we divide the data into smaller chunks, called batches, feed them to the computer one by one, and update the weights of the neural network at the end of every step so that it fits the given data.
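The bookkeeping between dataset size, batch size, and epochs reduces to simple arithmetic. In this sketch the 10,000 rows match the dataset mentioned earlier, while the batch size and epoch count are assumptions:

```python
dataset_size = 10000   # rows in the dataset (as mentioned above)
batch_size = 100       # assumed: samples processed before each weight update
epochs = 5             # assumed: full passes over the dataset

# One weight update per batch, so updates per epoch = dataset / batch size
iterations_per_epoch = dataset_size // batch_size   # 100 iterations per epoch
total_updates = iterations_per_epoch * epochs       # 500 weight updates in total
```

So even a modest five epochs gives the network hundreds of chances to adjust its weights toward the data.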
We use some more terminology like batch size, iterations, etc. We don't use these in today's code; we will discuss them in the next sessions...
See you in next session...
Happy Learning :)