MACHINE LEARNING WITH DEVOPS INTEGRATION MAKES MLOPS
In the data science world, data scientists need to tweak a model several times, manually, to find the version with the best accuracy. This takes a lot of time and manpower to build a machine learning or deep learning model precisely. In data science there is no shortage of cool stuff to do and shiny new algorithms to throw at data.
So now we can automate all of this with Jenkins and Docker containers.
We can do it by the following:
1. Create a container image that has Python3 and Keras/NumPy installed, using a Dockerfile.
2. When we launch this image, it should automatically start training the model in the container.
3. Create a job chain of job1, job2, job3, job4 and job5 using the Build Pipeline plugin in Jenkins.
4. Job1: Pull the GitHub repo automatically when a developer pushes to GitHub.
5. Job2: By looking at the code or program file, Jenkins should automatically start a container from the image that has the respective machine learning software installed, deploy the code, and start training (e.g. if the code uses a CNN, Jenkins should start the container that already has all the software required for CNN processing installed).
6. Job3: Train the model and report its accuracy or other metrics.
7. Job4: If the accuracy is less than 80%, tweak the machine learning model architecture.
8. Job5: Retrain the model, or notify the developer that the best model has been created.
9. Create one extra job, job6, for monitoring: if the container where the app is running fails for any reason, this job should automatically start the container again from where the last trained model left off.
---------------------------------------------------------------------------------------------------------------
First, we have to create a Docker image that sets up the training environment. Following is the code. Save the file with the name Dockerfile and no other name, or it will not work: the file name is predefined by Docker and we can't change it.
FROM python:3.6-slim
RUN pip install tensorflow
RUN pip install keras
RUN pip install scikit-learn
RUN pip install numpy
RUN pip install pandas
RUN pip install opencv-python
Save this Dockerfile and then run the following command to create a new Docker image:
docker build -t <name>:<tag/version> <location of dockerfile>
Once the build completes, our Docker image is ready, and using this image we can launch multiple training environments.
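For example, assuming the Dockerfile is in the current directory and we name the image tensorflowimage (the name the job scripts below expect):
docker build -t tensorflowimage .
docker images | grep tensorflowimage  # verify the image was created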
The next step is to create a GitHub repository where the developer or data scientist will upload/commit all the files required for the training of a model. So let's do it very quickly.
After you create the repository, clone it onto your PC and add a post-commit script in .git/hooks/ so that whenever you commit, the files are automatically pushed to GitHub. Following is the code for the post-commit script. Paste it into any text editor and make sure you save the file in the .git/hooks folder of your repository with the name post-commit (do not give it any extension; even .txt is not allowed).
#!/bin/bash
git push
echo "successfully pushed"  # a confirmation message
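One detail worth remembering: Git only runs hooks that are marked executable, so after saving the file run:
chmod +x .git/hooks/post-commit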
Now let's move on to creating the Jenkins jobs.
Job1:
Job1 pulls the GitHub repo automatically whenever a developer pushes to it. Give the job the repository URL under Source Code Management and enable Poll SCM (or a GitHub webhook) as the build trigger; the job then copies the pulled files to the directory that the training containers will mount, as shown in the sketch below.
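A minimal "Execute shell" step for job1 might look like this; the target path /root/MLOps/Python is an assumption based on the volume the later jobs mount into the containers:
# copy everything Jenkins just pulled from GitHub into the training directory
sudo mkdir -p /root/MLOps/Python
sudo cp -rvf * /root/MLOps/Python/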
Job2:
Our job2 is dependent on job1, so we use the trigger named "Build after other projects are built". The shell script for job2 is shown below.
if sudo cat /root/MLOps/Python/main.py | grep keras
then
    # the code uses Keras, so launch the deep learning container (if it isn't already running)
    if sudo docker ps | grep neuralnetwork
    then
        echo "Already Running"
    else
        sudo docker run -itd -v /root/MLOps/Python:/home --name neuralnetwork tensorflowimage
    fi
else
    # otherwise launch the scikit-learn container; both use the same image,
    # since the Dockerfile installs all the libraries, only the name differs
    if sudo docker ps | grep sklearn
    then
        echo "Already Running"
    else
        sudo docker run -itd -v /root/MLOps/Python:/home --name sklearn tensorflowimage
    fi
fi
Job3:
Job3 is built after job2 is built, so again we use the "Build after other projects are built" trigger. This job is also triggered when the accuracy is not up to the requirement, so the "Trigger builds remotely" option is used as well, with an authentication token.
We need an "Execute shell" step to run the Python file that starts training the model. The file main.py takes all its hyperparameter values from a CSV file named hyperparameters.csv, so that the next time the model is put into training it can pick up the changed values.
The shell script for job3 is shown below.
if sudo cat /root/MLOps/Python/main.py | grep keras
then
    sudo docker exec neuralnetwork python /home/main.py
else
    sudo docker exec sklearn python /home/main.py
fi
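The original write-up doesn't show hyperparameters.csv or how main.py reads it; a hypothetical sketch, assuming a simple key,value format, could be:

epochs,10
learning_rate,0.001
neurons,64

and inside main.py, something like:

import csv

# read the hyperparameters written by job5 (hypothetical key,value format)
with open('/home/hyperparameters.csv') as f:
    params = dict(csv.reader(f))

epochs = int(params['epochs'])
learning_rate = float(params['learning_rate'])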
Job4:
We need to check the accuracy of the trained model. For that, a file is executed which returns the accuracy; a shell script then compares it. If the accuracy meets the requirement (in our case a minimum of 80%), the execution of the chain is stopped, and if the accuracy is not good (less than 80%), the further jobs are executed.
The shell script for job4 is down here; let's check it.
if sudo cat /root/MLOps/Python/main.py | grep keras
then
    accuracy=$(sudo docker exec neuralnetwork python /home/accuracytest.py)
else
    accuracy=$(sudo docker exec sklearn python /home/accuracytest.py)
fi

# note: bash (( )) arithmetic is integer-only, so accuracytest.py
# should print the accuracy as a whole-number percentage
if (( $accuracy >= 80 ))
then
    # accuracy is good enough: notify the developer, then fail this job on
    # purpose (exit 1) so the downstream retraining job never triggers
    sudo python3 /root/MLOps/Python/success.py
    exit 1
else
    # accuracy too low: let the job succeed so job5 runs and tweaks the model
    sudo python3 /root/MLOps/Python/fail.py
fi
The file success.py sends a mail to the developer notifying them of the successful completion of the model.
Down here is the message sent by the file:
HEY!! DEVELOPER THE FILE IS READY TO GO:) WITH AWESOME ACCURACY:) LET'S CHECK IT OUT:)
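success.py itself isn't shown in the write-up; a minimal sketch of how it might send that mail with smtplib (the addresses, server, and app password are assumptions to be replaced):

import smtplib
from email.message import EmailMessage

msg = EmailMessage()
msg['Subject'] = 'Model trained successfully'
msg['From'] = 'mlops-bot@example.com'    # hypothetical sender
msg['To'] = 'developer@example.com'      # hypothetical recipient
msg.set_content("HEY!! DEVELOPER THE FILE IS READY TO GO:) WITH AWESOME ACCURACY:) LET'S CHECK IT OUT:)")

with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server:
    server.login('mlops-bot@example.com', 'app-password')  # assumed app password
    server.send_message(msg)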
Job5:
This job executes if the accuracy is not good; it changes some parameters and then triggers job3 again. A new file is executed in this job which changes the parameters and stores them in the CSV file containing the values of all the hyperparameters. As mentioned for job3, the Python file takes the values of all its hyperparameters from this file, so that whatever changes are to be made can be taken into consideration when the model is trained again.
The shell script for job5 is down here:
# change the hyperparameters for the next training run
sudo python3 /root/MLOps/Python/tweakparameters.py
# remotely trigger job3 (StartTraining) using its authentication token
sudo curl http://192.168.43.194:8080/job/StartTraining/build?token=starttraining
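tweakparameters.py is also not shown; one hypothetical sketch, assuming the key,value CSV format from the job3 example, simply nudges the values for the next run:

import csv

path = '/root/MLOps/Python/hyperparameters.csv'

# read the current hyperparameters (hypothetical key,value format)
with open(path) as f:
    params = dict(csv.reader(f))

# naive tweak: train longer with a wider layer next time
params['epochs'] = str(int(params['epochs']) + 5)
params['neurons'] = str(int(params['neurons']) * 2)

with open(path, 'w', newline='') as f:
    csv.writer(f).writerows(params.items())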
Job6:
If due to some circumstance a container in which training is running stops, the environment needs to be restarted; so job6 keeps track of every container and relaunches any container that has stopped.
The shell script for job6 is down here:
if sudo docker ps | grep neuralnetwork
then
    echo "Everything is OK, environment running properly"
else
    sudo docker run -itd -v /root/MLOps/Python:/home --name neuralnetwork tensorflowimage
fi

if sudo docker ps | grep sklearn
then
    echo "Everything is OK, environment running properly"
else
    sudo docker run -itd -v /root/MLOps/Python:/home --name sklearn tensorflowimage
fi
The job should be able to monitor continuously, so we make it build periodically, every hour (or as per your requirement).
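In the job's "Build periodically" trigger, an hourly schedule is written in Jenkins cron syntax, for example:
H * * * *
(the H lets Jenkins pick a minute of the hour to spread the load).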
All of this is done with the help of Jenkins...
By this, we can integrate ML with DevOps to achieve a wonderful project which makes the life of data scientists easier, saving the time that was invested in the manual trial and testing of models... It's all about automation :)
---------------------------------------------------------------------------------------------------
THANKS FOR YOUR ATTENTION...