Machine Learning Automated
According to a report on the internet, 60% of machine learning projects are never completed. Ever wondered why?
Machine learning projects require a lot of tweaking over the course of training and testing. Here is a simulation in which I have tried to automate my machine learning workflow.
Also, I know that this simulation will hardly be used in real life, but it is just a project and an inspiration for machine learning enthusiasts like me.
About the project:
I have used technologies like Git, GitHub, Jenkins, Docker, Python, and Bash to execute this challenge.
I will be training a deep learning model and checking its accuracy. If the accuracy is less than 85% (say), another Python program will modify the code by adding a few CRP (Convolutional and MaxPooling) layers to it.
We begin by creating a Docker image that will start training the model as soon as a container is launched from it.
The Dockerfile goes as follows:
FROM centos:7
RUN yum install python3 -y
RUN yum install libXext -y
RUN yum install libXrender -y
RUN yum install libSM -y
RUN pip3 install -U pip
RUN pip install keras
RUN pip install numpy
RUN pip install tensorflow
RUN pip install pillow
RUN pip install opencv-python
RUN pip install sklearn
RUN pip install pandas
CMD python3 /root/CNN.py
This Dockerfile installs the above Python packages, and the resulting image starts training the model as soon as a container is launched from it.
docker build -t pyos:v4 .
The above command creates a Docker image named pyos:v4 (this is what I named it).
Let's start by creating our first Jenkins job (JOB1 --> github_pull)
This job will pull the CNN code from GitHub and copy it to a folder on the host.
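A JOB1 build step can be as simple as a copy out of the Jenkins workspace. The exact step is not shown in the post, so this is a sketch based on my assumptions; the destination folder matches the path used by the later jobs.

```shell
#!/bin/bash
# Hypothetical JOB1 "Execute shell" step: Jenkins has already cloned the
# repo into its workspace, so we only need to copy the code to the host
# folder that JOB2 later mounts into the container.
copy_code() {
    mkdir -p "$1"
    cp -f ./*.py "$1"/
}
# In the real job, run from the workspace root:
# copy_code "/root/cats and dogs"
```

With "Poll SCM" (or a GitHub webhook) configured, this job fires on every push, which is what closes the loop after JOB4 pushes tweaked code.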
Moving on to the next job (JOB2 --> container_deploy)
This job will run only if the first job succeeds, and it executes the following bash script.
#!/bin/bash
if sudo cat /root/cats\ and\ dogs/CNN.py | grep keras
then
    if sudo docker ps -a | grep neural
    then
        sudo docker rm -f neural
        sudo docker run -v /root/cats\ and\ dogs:/root --name neural pyos:v4
    else
        sudo docker run -v /root/cats\ and\ dogs:/root --name neural pyos:v4
    fi
elif sudo cat /root/cats\ and\ dogs/CNN.py | grep sklearn
then
    if sudo docker ps -a | grep lr
    then
        sudo docker rm -f lr
        sudo docker run -v /root/cats\ and\ dogs:/root --name lr pyos:v6
    else
        sudo docker run -v /root/cats\ and\ dogs:/root --name lr pyos:v6
    fi
else
    echo "Not a ML code"
fi
The script launches the container corresponding to the type of code (deep learning or logistic regression).
Moving on to the next job (JOB3 --> accuracy_check)
This job will execute only if JOB2 was successful; the fun part begins here.
This job runs a Python script which, as the name suggests, checks the accuracy. The code is as follows:
import os

f = open("/root/cats and dogs/output.txt", "r")
acc = int(f.read(3))
f.close()

if acc <= 85:
    os.system("python3 /root/cats\ and\ dogs/tweak.py")
else:
    print("Accuracy is good enough")
This code reads the output.txt file, whose first line holds the accuracy written by our CNN code. If the accuracy is 85% or less, it changes the code (by running the tweak.py script); otherwise it simply prints that the accuracy is good enough. This condition can be changed according to the needs of the project.
The tweak.py script goes as follows,
import fileinput

file_name = '/root/cats and dogs/CNN.py'
# the extra CRP (Convolution + Pooling) block to insert after the matched line
crp_layer = '''model.add(Convolution2D(filters = 64, kernel_size = (3, 3), activation = 'relu'))
model.add(MaxPooling2D(pool_size = (2,2)))
'''
for line in fileinput.FileInput(file_name, inplace = 1):
    if "model.add(MaxPooling2D(pool_size = (2,2)))" in line:
        line = line + crp_layer
    print(line, end = '')
It uses basic Python file handling to add another CRP layer right after the first one.
Remember, this project is just a simulation of automating machine learning code; the changes can be adapted by the developer according to the need.
Moving to the next job (JOB4 --> tweak)
This job will execute only if the previous job was successful.
It runs a short bash script that uses basic bash scripting concepts.
The first line gets the last line of output.txt and stores it in a variable, the second line converts that variable into an integer, and the if condition that follows is straightforward.
If the accuracy is greater than 85, the job finishes successfully; otherwise the changed code is pushed to GitHub and the next cycle begins again from JOB1.
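The JOB4 script itself is not shown above, so here is a minimal sketch of what it could look like; the paths, the branch name, and the commit message are my assumptions, and only the 85% threshold comes from the text.

```shell
#!/bin/bash
# Hypothetical JOB4 (tweak) script.
check_accuracy() {
    # first step: read the last line of output.txt into a variable
    acc=$(tail -n 1 "$1")
    # second step: force it into an integer by stripping non-digits
    acc=$(echo "$acc" | tr -cd '0-9')
    if [ "$acc" -gt 85 ]; then
        echo "good"
    else
        echo "retrain"
    fi
}
# In the real job:
# if [ "$(check_accuracy '/root/cats and dogs/output.txt')" = "retrain" ]; then
#     cd "/root/cats and dogs"
#     sudo git add CNN.py
#     sudo git commit -m "auto-tweak: added a CRP layer"
#     sudo git push origin master   # JOB1 picks this push up, restarting the cycle
# fi
```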
We can also create a JOB5 which monitors our Docker containers continuously and rebuilds JOB2 if any of them fails.
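A sketch of such a JOB5: exiting with a non-zero status makes Jenkins mark the build as failed, which can be configured (via a post-build trigger) to run JOB2 again. The container names are the ones used earlier; the rest is my assumption.

```shell
#!/bin/bash
# Hypothetical JOB5: check whether a container name appears in a
# captured "docker ps" listing; a missing container fails the job.
container_listed() {
    echo "$2" | grep -q "$1"
}
# In the real job:
# listing=$(sudo docker ps)
# if ! container_listed neural "$listing" && ! container_listed lr "$listing"; then
#     exit 1   # failure triggers a rebuild of JOB2 (container_deploy)
# fi
```

For reference, below is the CNN code (CNN.py) that the container trains.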
from keras.layers import Convolution2D, MaxPooling2D, Dense, Flatten
from keras.models import Sequential
from keras_preprocessing.image import ImageDataGenerator
model = Sequential()
model.add(Convolution2D(filters = 30,
kernel_size = (3, 3),
activation = 'relu',
input_shape = (64, 64, 3)))
model.add(MaxPooling2D(pool_size = (2,2)))
model.add(Flatten())
model.add(Dense(units = 128,
activation = 'relu'))
model.add(Dense(units = 64,
activation = 'relu'))
model.add(Dense(units = 1,
activation = 'sigmoid'))
model.compile(optimizer = 'adam',
loss = 'binary_crossentropy',
metrics = ['accuracy'])
train_datagen = ImageDataGenerator(
rescale = 1./255,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory(
'/root/training_set',
target_size = (64, 64),
batch_size = 32,
class_mode = 'binary')
testing_set = test_datagen.flow_from_directory(
'/root/test_set',
target_size = (64, 64),
batch_size = 32,
class_mode = 'binary')
his = model.fit(training_set,
steps_per_epoch = 250,
epochs = 1,
validation_data = testing_set,
validation_steps = 800)
final_acc = his.history['accuracy'][-1] * 100
f = open("/root/output.txt", "w")
f.write("%d" % int(final_acc))
f.close()
You need not focus on the model code itself; it is very basic. The important part is the last three lines, which create an output.txt file containing the accuracy of our model. This output.txt is the same file used by the third job (accuracy_check).
Github Repository (it contains all the codes and bash files):
https://github.com/cptn3m0grv/ML-automated
Thank you,
Gaurav Goyal