MLOps : Session - 20
Let's start from where we left off yesterday... In the last article, I discussed the convolution layer of the CNN architecture. We also looked at what kernels and strides are in the convolutional layer.
Let's discuss the CNN architecture in more detail:
Architecture of CNN:
There is an input image that we’re working with. We perform a series of convolution + pooling operations, followed by a number of fully connected layers. If we are performing multi-class classification, the output layer uses softmax. We will now dive into each component.
Convolve: Convolution is the mechanism we use to detect edges in our image so that the machine can easily perform feature extraction.
The main responsibility of the convolutional layer is feature extraction. It is not responsible for reducing the number of features.
You can go back to the last article for a detailed discussion of the convolutional layer.
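To make the edge-detection idea concrete, here is a minimal sketch of a 2D convolution in plain NumPy. The 6x6 image, the vertical edge-detection kernel, and the `convolve2d` helper are all made up for this illustration; they are not from the article.

```python
import numpy as np

# A tiny 6x6 grayscale "image": dark left half (0), bright right half (1),
# so there is a vertical edge down the middle.
image = np.array([
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
], dtype=float)

# A 3x3 vertical edge-detection kernel: right column minus left column.
kernel = np.array([
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
], dtype=float)

def convolve2d(img, k):
    """Slide the kernel over the image (stride 1, no padding)."""
    kh, kw = k.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

feature_map = convolve2d(image, kernel)
print(feature_map)  # non-zero values appear only around the edge
```

Notice that the 6x6 input shrinks to a 4x4 feature map because there is no padding, and the large responses sit exactly where the edge is.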
Pooling :
After the convolution operation, we usually perform pooling. Pooling layers reduce the number of parameters when the images are large, which both shortens training time and combats overfitting.
Pooling can be of different types :
- Max Pooling
- Avg Pooling
- Min Pooling
The most common type of pooling is max pooling, which simply takes the maximum value in the pooling window. Let's understand it with the example below:
In CNN architectures, pooling is typically performed with 2x2 windows, stride 2, and no padding, while convolution is done with 3x3 windows, stride 1, and padding.
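As a quick illustration, here is a minimal NumPy sketch of 2x2 max pooling with stride 2. The feature-map values and the `max_pool` helper are made up for this example:

```python
import numpy as np

# A 4x4 feature map, as it might come out of a convolution layer.
feature_map = np.array([
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [5, 2, 9, 7],
    [0, 1, 3, 8],
])

def max_pool(x, size=2, stride=2):
    """Max pooling: keep the maximum of each (size x size) window."""
    oh, ow = x.shape[0] // stride, x.shape[1] // stride
    out = np.zeros((oh, ow), dtype=x.dtype)
    for i in range(oh):
        for j in range(ow):
            window = x[i * stride:i * stride + size,
                       j * stride:j * stride + size]
            out[i, j] = window.max()
    return out

pooled = max_pool(feature_map)
print(pooled)
# [[6 2]
#  [5 9]]
```

The 4x4 map becomes 2x2: each output value is the largest number in its 2x2 window, so the strongest activations survive while the parameter count for later layers drops.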
Fully Connected Layer :
After the convolution + pooling layers, we add a couple of fully connected layers to wrap up the CNN architecture. This is the same fully connected architecture as a standard ANN.
The output of the convolution + pooling layers is a 3D volume, but a fully connected layer expects a 1D vector of numbers. So we flatten the output of the final pooling layer into a vector, and that becomes the input to the fully connected layer.
Flattening simply means arranging the 3D volume of numbers into a 1D vector. You will get a better understanding when we write some code...
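To see what flattening does, here is a tiny NumPy sketch; the 2x2x3 volume is a made-up example standing in for the output of the last pooling layer:

```python
import numpy as np

# A small 3D volume: 2x2 spatial size with 3 channels (12 numbers total).
volume = np.arange(12).reshape(2, 2, 3)
print(volume.shape)  # (2, 2, 3)

# Flattening arranges all 12 numbers into a single 1D vector.
flat = volume.flatten()
print(flat.shape)    # (12,)
```

No information is lost; the same 12 numbers are just laid out in a shape that a Dense layer can accept.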
Training:
Today I'm not using any dataset. I will only describe how we create our model. We will do the training and prediction parts in the next session. It is almost the same as what we do in a plain neural network.
In deep learning, first create a summary or rough sketch so that you can easily understand what you are doing...
- The first thing I'm going to do is import the Convolution layer, MaxPooling layer, Flatten layer, and Dense layer, and create a Sequential model.
- Now let's start adding layers to our model one by one...
- Add a Convolution layer...
Input_shape is the shape of the image. You can see that I'm not using padding here, which is why the output shape is reduced a little bit.
Kernel_size and filters are hyperparameters.
- Add a Maxpooling layer...
The MaxPooling layer uses a (2 x 2) window by default. Just press Shift + Tab to check the details of the function. You can see that the number of pixels along each dimension is halved after we add the max-pooling layer.
You can add multiple combinations of convolutional + pooling layers.
- Add a Flatten layer so that we convert the 3D volume to a vector.
In the output shape of the Flatten layer, you can see that the volume has been converted into a 1D array.
- After that, we do the same things as in a standard neural network architecture: we add some Dense layers for computation. In the output layer, we use the sigmoid activation function for binary classification and the softmax activation function for multi-class classification.
The number of layers is a hyperparameter, so you can add as many layers as you want.
- Finally, compile your model with the optimizer and loss parameters.
model.compile(optimizer='adam', loss='binary_crossentropy')
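Putting the steps above together, here is a minimal sketch of the whole model in tf.keras. The input shape (64, 64, 3), the filter count, and the Dense layer sizes are illustrative assumptions, not values from the article:

```python
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    Input(shape=(64, 64, 3)),  # assumed image shape (height, width, channels)
    # Convolution layer: 32 filters of size 3x3, stride 1, no padding,
    # so the spatial size shrinks from 64x64 to 62x62.
    Conv2D(filters=32, kernel_size=(3, 3), activation='relu'),
    # Max-pooling layer: the default 2x2 window halves the spatial size.
    MaxPooling2D(),
    # Flatten the 3D volume into a 1D vector for the Dense layers.
    Flatten(),
    Dense(128, activation='relu'),
    # Sigmoid output for binary classification; for multi-class
    # classification, use Dense(num_classes, activation='softmax') instead.
    Dense(1, activation='sigmoid'),
])

model.compile(optimizer='adam', loss='binary_crossentropy')
model.summary()
```

`model.summary()` prints the output shape of every layer, which is where you can see the shrinking effect of the missing padding and the halving done by max pooling.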
Now our model is ready to take input and start training.
In the next article, we are going to see how to feed an image as input and get the predicted output.
Hope we meet again soon...