FACE-RECOGNITION USING TRANSFER LEARNING
Transfer learning is useful when we have insufficient data for a new domain we want a neural network to handle, but there is a large pre-existing pool of data whose knowledge can be transferred to our problem. For example, we might have only 1,000 images of dogs, but by feeding them into an existing CNN such as VGG-16, trained on more than 1 million images, we gain a lot of low-level and mid-level feature definitions. We take the pre-trained model, freeze some layers, and fine-tune the others. This saves a lot of training time and also power consumption.
DATASET
A dataset is simply a collection of data, and it can come in many forms: .jpg, .jpeg, .gif, .json, and so on. Converting a dataset into the form we require is called DATA WRANGLING. I have collected the dataset from Kaggle. It is divided into train data and validation data, each consisting of five classes of celebrity faces.
GOOGLE COLABORATORY
I have used Google Colaboratory for training the neural network because it provides a free GPU. You can read more about Google Colab at the above link.
IMPORTING LIBRARIES
I have imported the VGG-16 network from keras.applications, along with preprocess_input, which is used for image data preprocessing. To know more about preprocessing, visit https://keras.io/api/preprocessing/image/ .

I have also imported the ImageDataGenerator class for data augmentation. To know more about data augmentation, visit https://machinelearningmastery.com/how-to-configure-image-data-augmentation-when-training-deep-learning-neural-networks/ . It is a beautiful blog post by Jason Brownlee of Machine Learning Mastery.

From keras.models I have imported the Model class, which groups layers into an object with training and inference features (to know more, visit https://keras.io/api/models/model/ ), and the Sequential class, which is used for a plain stack of layers where each layer has exactly one input tensor and one output tensor (to know more, visit https://keras.io/guides/sequential_model/ ).

I have also imported the layers used to build this network: Conv2D (a convolution layer), MaxPooling2D (a max-pooling layer), Dense and Activation (core layers), Dropout (a regularization layer), and Flatten (a reshaping layer). To know more about layers, check out https://keras.io/api/layers/ . Finally, I have imported some basic packages such as numpy and pandas.
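Collected together, the imports above look roughly like this (a sketch assuming TensorFlow's bundled Keras; a standalone Keras install would drop the `tensorflow.` prefix):

```python
import numpy as np
import pandas as pd

from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import (Conv2D, MaxPooling2D, Dense,
                                     Activation, Dropout, Flatten)
```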
LOADING DATA AND PRE-PROCESSING
I have set the image size to 224 because 224 × 224 is the default input size of the VGG-16 network and gives the best accuracy. I have divided the dataset into train data (91 images) and validation data (25 images); each folder consists of five classes of celebrity faces. I have used a preprocessing function to preprocess the data.
We cannot pass keras.applications.vgg16.preprocess_input() directly to the `preprocessing_function` argument of keras.preprocessing.image.ImageDataGenerator, because the former expects a 4D tensor whereas the latter supplies a 3D tensor. Hence the need for a wrapper.
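A minimal sketch of such a wrapper (the function name `vgg16_preprocess` is my own, and I assume TensorFlow's bundled Keras):

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def vgg16_preprocess(img):
    """ImageDataGenerator hands us one 3D image (H, W, C): add the batch
    axis, run VGG-16's preprocess_input, and strip the axis again."""
    return preprocess_input(np.expand_dims(img, axis=0))[0]

datagen = ImageDataGenerator(preprocessing_function=vgg16_preprocess)
```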
LOADING WEIGHTS AND FREEZING BOTTLENECK LAYERS
I have loaded the ImageNet weights using the VGG16() function; the weights are about 528 MB in size. To know more about the VGG16() function, refer to https://keras.io/api/applications/vgg/#vgg16-function . ImageNet is an image database organized according to the WordNet hierarchy; to know more about ImageNet, visit http://www.image-net.org/ . I have used the softmax function, which converts a real vector to a vector of categorical probabilities: the elements of the output vector are in the range (0, 1) and sum to 1. To know more about activation functions, visit https://keras.io/api/layers/activations/ . I have frozen all the layers except the bottleneck layers for fine-tuning.
Fine-tuning is a multi-step process:
- Remove the fully connected nodes at the end of the network (i.e., where the actual class label predictions are made).
- Replace the fully connected nodes with freshly initialized ones.
- Freeze the earlier CONV layers in the network (ensuring that any robust features previously learned by the CNN are not destroyed).
- Start training, but only train the FC layer heads.
- Optionally unfreeze some/all of the CONV layers in the network and perform a second pass of training.
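The steps above can be sketched with tf.keras as follows (I pass weights=None to keep the sketch download-free; the project itself loads weights='imagenet', and the 256-unit head size is illustrative):

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

# Steps 1-2: drop VGG-16's FC head (include_top=False) and attach a fresh one.
base = VGG16(weights=None, include_top=False, input_shape=(224, 224, 3))
x = Flatten()(base.output)
x = Dense(256, activation='relu')(x)
predictions = Dense(5, activation='softmax')(x)  # 5 celebrity classes
model = Model(inputs=base.input, outputs=predictions)

# Step 3: freeze the convolutional base so its learned features survive.
for layer in base.layers:
    layer.trainable = False

# Step 4: compile and train only the new FC head (model.fit / fit_generator).
# Step 5 (optional): unfreeze the last conv block and run a second pass.
for layer in base.layers[-4:]:
    layer.trainable = True
```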
IMAGE DATA GENERATOR
The Keras ImageDataGenerator class works by:
- Accepting a batch of images used for training.
- Taking this batch and applying a series of random transformations to each image in the batch (including random rotation, resizing, shearing, etc.).
- Replacing the original batch with the new, randomly transformed batch.
- Training the CNN on this randomly transformed batch (i.e., the original data itself is not used for training).
STEPS IN IMPLEMENTING IMAGEDATAGENERATOR CLASS
- Step 1: An input batch of images is presented to the ImageDataGenerator.
- Step 2: The ImageDataGenerator transforms each image in the batch by a series of random translations, rotations, etc.
- Step 3: The randomly transformed batch is then returned to the calling function.
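A sketch of those three steps with random stand-in images (the transform ranges here are illustrative, not the project's exact values):

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Step 2's random transformations: rotations, shifts, shears, zooms, flips.
aug = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True)

# Steps 1 and 3: present a batch and get back its transformed replacement;
# the untouched originals are never used for training.
batch = np.random.rand(4, 224, 224, 3)
augmented = next(aug.flow(batch, batch_size=4, shuffle=False))
```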
COMPILING
I have used validation_generator.class_indices to check the class ID of each class in the data. I have used the SGD optimizer class, which implements gradient descent (with momentum). To know more about SGD, check here: https://keras.io/api/optimizers/sgd/ .
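Compilation might look like this (a tiny stand-in model keeps the sketch self-contained; in the project the model is the fine-tuned VGG-16, and the learning rate and momentum values here are illustrative):

```python
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import SGD

# Stand-in model for the fine-tuned VGG-16.
model = Sequential([Flatten(input_shape=(224, 224, 3)),
                    Dense(5, activation='softmax')])

# SGD = gradient descent with momentum.
sgd = SGD(learning_rate=1e-3, momentum=0.9)
model.compile(optimizer=sgd, loss='categorical_crossentropy',
              metrics=['accuracy'])
```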
Here is the concept of finding a better learning rate.
The automatic learning rate finder algorithm works like this:
- Step 1: We start by defining an upper and lower bound on our learning rate. The lower bound should be very small (1e-10) and the upper bound should be very large (1e+1).
- At 1e-10 the learning rate will be too small for our network to learn, while at 1e+1 the learning rate will be so large that the loss will explode.
- Both of these extremes are okay; in fact, that's what we hope to see!
- Step 2: We then start training our network, starting at the lower bound.
- After each batch update, we exponentially increase our learning rate.
- We log the loss after each batch update as well.
- Step 3: Training continues, and therefore the learning rate continues to increase until we hit our maximum learning rate value.
- Typically, this entire training process/learning rate increase only takes 1-5 epochs.
- Step 4: After training is complete we plot a smoothed loss over time, enabling us to see when the learning rate is both:
- Just large enough for loss to decrease
- And too large, to the point where loss starts to increase.
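The exponential increase in Steps 2-3 can be sketched numerically; a real finder would set each value on the optimizer before a batch update and record the resulting batch loss:

```python
import numpy as np

# Step 1's bounds, and one learning rate per batch update.
lower, upper, num_updates = 1e-10, 1e+1, 100

# Exponentially spaced rates: multiply by a constant factor each step.
lrs = lower * (upper / lower) ** (np.arange(num_updates) / (num_updates - 1))
```

Plotting the recorded losses against `lrs` (Step 4) then reveals the window where loss first starts falling and where it blows up.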
The ModelCheckpoint callback is used to save the Keras model or model weights at some frequency. It is used in conjunction with training via model.fit() to save the model or weights (in a checkpoint file) at some interval, so the model or weights can be loaded later to continue training from the saved state.
The EarlyStopping callback is used to stop training when a monitored metric has stopped improving. Assuming the goal of training is to minimize the loss, the metric to be monitored would be 'loss' and the mode would be 'min'. The model.fit() training loop then checks at the end of every epoch whether the loss is still decreasing, taking min_delta and patience into account; once it finds the loss is no longer decreasing, model.stop_training is set to True and training terminates.
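The two callbacks might be set up like this (the file name, min_delta, and patience values are illustrative):

```python
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

# Keep only the weights of the epoch with the best validation loss.
checkpoint = ModelCheckpoint('best_model.h5', monitor='val_loss',
                             save_best_only=True, verbose=1)

# Stop once the training loss has stopped decreasing for 5 epochs.
early_stop = EarlyStopping(monitor='loss', mode='min',
                           min_delta=1e-3, patience=5, verbose=1)

# Later passed to training as:
# model.fit(..., callbacks=[checkpoint, early_stop])
```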
Working of the .fit_generator function
Internally, Keras uses the following process when training a model with .fit_generator:
- Keras calls the generator function supplied to .fit_generator (in this case, aug.flow).
- The generator function yields a batch of size BS to the .fit_generator function.
- The .fit_generator function accepts the batch of data, performs backpropagation, and updates the weights in our model.
- This process is repeated until we have reached the desired number of epochs.
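That loop can be sketched end-to-end with a tiny stand-in model and random data (the real project trains the fine-tuned VGG-16 on flow_from_directory generators; note that recent TensorFlow deprecates .fit_generator in favour of passing the generator straight to model.fit):

```python
import numpy as np
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Sequential
from tensorflow.keras.preprocessing.image import ImageDataGenerator

BS = 4
x = np.random.rand(8, 32, 32, 3).astype('float32')
y = np.eye(5)[np.random.randint(0, 5, size=8)]  # one-hot, 5 classes

model = Sequential([Flatten(input_shape=(32, 32, 3)),
                    Dense(5, activation='softmax')])
model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])

# aug.flow yields batches of size BS; fit consumes each batch, performs
# backpropagation, and updates the weights until the epochs are done.
aug = ImageDataGenerator(rotation_range=10)
history = model.fit(aug.flow(x, y, batch_size=BS),
                    steps_per_epoch=len(x) // BS, epochs=1, verbose=0)
```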
I have got a training accuracy of around 99.98% on average and a val_accuracy of around 84% on average, even though the dataset consists of fewer than 100 images.
SUMMARY OF THE MODEL
I have used the model.save_weights() function to save the weights of the best model.
GRAPH
This is the model accuracy plot that has been generated.
CONVERTING FINAL MODEL INTO JSON FORMAT
I have converted the stored best-fit model into JSON format so that it can be reloaded later to reproduce the best results. I have used pyplot to display the actual output as a bar plot.
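The conversion might look like this (file names are illustrative, and a tiny stand-in model keeps the sketch runnable; the trained VGG-16 is saved the same way):

```python
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential, model_from_json

model = Sequential([Dense(5, activation='softmax', input_shape=(10,))])

# Architecture to JSON, weights to HDF5.
with open('model.json', 'w') as f:
    f.write(model.to_json())
model.save_weights('model.weights.h5')

# Later: rebuild the architecture from JSON and reload the weights.
with open('model.json') as f:
    restored = model_from_json(f.read())
restored.load_weights('model.weights.h5')
```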
OUTPUT
I have used a bar chart to visually show the model's prediction of the class corresponding to the celebrity face, together with the probability estimated by the neural network.
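A sketch of that bar chart (the class names and probability vector here are illustrative stand-ins for a real model.predict() output):

```python
import matplotlib
matplotlib.use('Agg')  # headless backend, so this runs in scripts/Colab
import matplotlib.pyplot as plt
import numpy as np

classes = ['celeb_1', 'celeb_2', 'celeb_3', 'celeb_4', 'celeb_5']
probs = np.array([0.05, 0.02, 0.85, 0.05, 0.03])  # e.g. model.predict(img)[0]

plt.bar(classes, probs)
plt.ylabel('Estimated probability')
plt.title('Predicted: ' + classes[int(np.argmax(probs))])
plt.savefig('prediction.png')
```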
LAYER BY LAYER VISUALIZATION OF THE NEURAL NETWORK
I have used the concept of a feature map to visualize how the input image is transformed by the neural network, layer by layer. To know more about feature maps, check this out: https://www.quora.com/What-is-meant-by-feature-maps-in-convolutional-neural-networks
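A sketch of the feature-map idea (weights=None keeps it download-free; the project visualizes its trained network, and the layer name here is just one early example):

```python
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model

base = VGG16(weights=None, include_top=False, input_shape=(224, 224, 3))

# A sub-model that stops at an early conv layer exposes its activations.
layer = base.get_layer('block1_conv2')
viz = Model(inputs=base.input, outputs=layer.output)

img = np.random.rand(1, 224, 224, 3).astype('float32')
feature_maps = viz.predict(img, verbose=0)  # one 224x224 map per filter

# Each of the 64 channels could then be displayed with plt.imshow(...).
```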
Drive Link
References
Thank you for reading this article; feel free to ask any queries. If you liked it, feel free to comment and share.