Transfer Learning
Transfer learning is a machine learning method where you store knowledge gained from one task/model and apply it to a different but related problem.
What did we do?
Decided to use EfficientNetV2S from the Keras Applications API, freeze some of its layers, and add my own layers to apply it to the CIFAR-10 dataset. There was a bit of iterating to find the best combination of Dense layers, units per layer, number of layers frozen, and the optimizer used.
What was the problem?
This week we were tasked with applying one of the pre-trained models in Keras' Applications API and tweaking it a bit to fit the CIFAR-10 dataset, achieving a validation accuracy of at least 87%.
How was the problem solved?
Based on Andrew Ng's Deep Learning Specialization and Laurence Moroney's TensorFlow Specialization, I decided to start off with about 3/4 of the layers frozen and 3 Dense layers with a decreasing number of units in each. From there I would try increasing and decreasing the number of frozen layers, then try each option with 2 or 3 Dense layers, and then switch the optimizer from Adam to RMSProp.
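The search described above can be sketched as a simple grid over the three knobs being varied. The specific values below (freeze fractions, head depths) are illustrative assumptions, not the exact ones tried; in practice each configuration would be trained briefly and scored on the CIFAR-10 validation split.

```python
from itertools import product

# Hypothetical search grid over the three hyperparameters varied above.
freeze_fractions = [0.5, 0.75, 0.9]   # fraction of backbone layers frozen (assumed values)
head_depths = [2, 3]                  # number of Dense layers in the classification head
optimizer_names = ["adam", "rmsprop"] # the two optimizers compared

# Every (fraction, depth, optimizer) combination to train and evaluate:
configs = list(product(freeze_fractions, head_depths, optimizer_names))
# 3 * 2 * 2 = 12 training runs in total, which is why the cycle is slow.
```

Even a tiny grid like this multiplies quickly, which is what makes the iteration cycles so time-consuming.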
What did I find out?
It was quite the time-consuming endeavor to run all these iterations only to realize the results weren't getting any better, and were sometimes outright worse, but it was ultimately a fun experience. Eventually I found that freezing fewer layers (about half) gave better results, and that keeping it simple with 3 Dense layers of 256, 128, and 64 units, with some Dropout layers (rate=0.5/0.4) to avoid overfitting, was the best combination. Surprisingly, the optimizer that worked best was actually RMSProp.
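The winning configuration can be sketched roughly as follows. The backbone, head sizes, Dropout rates, and optimizer come from the description above; the exact Dropout placement and the learning rate are assumptions. Note that `weights=None` keeps the sketch download-free, whereas actual transfer learning would pass `weights="imagenet"` to reuse the pretrained features.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Pretrained backbone without its ImageNet classifier head.
# (weights=None avoids the weight download in this sketch; use
# weights="imagenet" in practice to actually transfer knowledge.)
base = keras.applications.EfficientNetV2S(
    include_top=False,
    weights=None,
    input_shape=(32, 32, 3),  # CIFAR-10 image size
    pooling="avg",
)

# Freeze roughly the first half of the backbone's layers,
# the split that ended up working best.
freeze_until = len(base.layers) // 2
for layer in base.layers[:freeze_until]:
    layer.trainable = False

# Small classification head: Dense 256 -> 128 -> 64 with Dropout
# (0.5 / 0.4; placement assumed) to limit overfitting, then a
# 10-way softmax for the CIFAR-10 classes.
model = keras.Sequential([
    base,
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.4),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

# RMSProp was the optimizer that worked best here.
model.compile(
    optimizer="rmsprop",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```

From here, `model.fit` on the CIFAR-10 training split (with a validation split held out) is what drives the accuracy numbers discussed below.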
What did it all mean?
It was an excellent experience to see how transfer learning can be applied to similar tasks, and to start to really see just how long these iteration cycles and hyperparameter tuning can take. In this case, less was more, and ultimately the 87% validation accuracy was achieved!