A practiced and practical approach to encapsulating machine learning into microservices
Microservices combined with machine learning introduce a high-speed IT architecture for companies across a range of industries seeking digital transformation to deliver an improved, seamless, multi-channel customer experience, digital business models and real-time intelligence. Microservices and machine learning are two orthogonal concepts in their evolution, architecture style, underlying reference technology stack, algorithmic trends, affinity to the monolith, development life cycle and test strategy. As "thinking digital apps" becomes the tagline of every industry, the marriage of these two orthogonal concepts will play an important role in addressing evolving market demands such as AI and analytics, agility and speed, web scale, lower cost, reduced cycle times, innovation and a fail-fast strategy. Companies delivering digital services are experimenting to propose a standard production-scale framework for packing machine learning into a microservices architecture in an optimized fashion.
Microservices are loosely coupled and independent services in terms of functionality, technology, task and deployment. Microservices Architecture (MSA) is an application architecture style in which large, complex software applications are composed of one or more services based on independent functionality. The main advantage of a microservice is the ability to add or modify functionality very quickly without impacting the rest of the application. Since MSA implies greater agility, availability and scalability, microservices eventually act as building blocks for modern enterprise applications and become one of the enablers of the digital transformation journey.
In spite of machine learning gaining tremendous popularity in solving real-life problems, a major challenge persists in deploying models to the production environment because training data sets keep changing. The desire to use machine learning in practical applications faces multiple hurdles with pre-built offline models, as enterprise data sets are always evolving. Every delta addition of new data requires rebuilding the model, recomputing the scores and deploying the new model without affecting the production environment. Microservices architecture acts as a natural enabler in achieving this seamless integration without disrupting production.
Best Practices
Build & Deployment: One of the key aspects behind any successful AI/ML application is how easily models can be built and deployed. With the evolution of deep learning, and given that a proper LSTM-based deep learning model needs gigabytes of data to build an optimized model, it has become immensely important to establish a best practice so that building and deploying models can happen without hampering the production environment. A commonly used approach is to use a robust training data set to generate an optimized model offline, validate it against a pre-defined validation data set offline, and then upload the model file to the underlying microservice. The central messaging hub used in a microservices architecture can collect feature data from different sources in real time; for bulk feature data transfer, an ETL tool or batch program can be provisioned. This data is used to build models. The model-building environment could be a big data infrastructure like Hadoop or a simple in-house Python script. Once prepared and tested offline, the model can be pushed to the associated microservice, as sketched below.
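A minimal sketch of this offline build-validate-publish flow, assuming scikit-learn for training and joblib for serialization; the model store path, model name and acceptance threshold are illustrative, not prescriptive:

```python
# Minimal sketch of the offline build-validate-publish flow described above.
# Assumptions (illustrative, not prescriptive): scikit-learn for training,
# joblib for serialization, and a shared model store path that the
# microservice reads, e.g. an NFS mount.
import joblib
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

MODEL_STORE = "/mnt/models/churn_model.pkl"  # hypothetical shared path

def build_and_publish(X, y, threshold=0.8):
    # Hold out a pre-defined validation split so the model is vetted offline.
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2)
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    # Validate offline; refuse to publish an under-performing model.
    score = accuracy_score(y_val, model.predict(X_val))
    if score < threshold:
        raise ValueError(f"Model rejected: validation accuracy {score:.2f}")

    # Push the vetted model to the store read by the production microservice.
    joblib.dump(model, MODEL_STORE)
    return score
```

The acceptance gate keeps an under-performing model from ever reaching the shared store, so the production microservice only sees vetted artifacts.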
Packaging: Microservice architecture enables developers to either 1) publish a sole ML capability as a microservice or 2) publish an independent functionality as a microservice with an embedded ML capability. In both cases, the machine learning component (in whole or in part) has to be handled separately because of the distinct concerns involved in machine learning and cognitive computing: data acquisition, model building/training and model updating. Once a data scientist finalizes the algorithm and the underlying technology, the delivery IT team should decide on the appropriate integration tooling to publish the ML functionality as a service API. For example, a model file built offline using a Python script can be kept in the file storage or database of a machine. The integration team can build a Python interface program that connects to the model by accessing the file or database system. A REST API can then be provisioned on the same machine to publish the functionality in a secured fashion. To do so, a combination of a Python runtime and a production-scale web server could be chosen for the microservice. Once the team decides on the API for the outside world, an API management system can be considered on top of it. Below is the anatomy of an ML microservice for reference.
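For illustration, a minimal sketch of such a Python interface program, assuming Flask as the web framework and the illustrative model path used earlier; in production it would run behind a production-scale WSGI server (for example, gunicorn) rather than Flask's development server:

```python
# Minimal sketch of the REST layer: a Flask app that loads the model built
# offline and exposes it as a prediction endpoint. Paths, route and field
# names are illustrative.
import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("/mnt/models/churn_model.pkl")  # model built offline

@app.route("/predict", methods=["POST"])
def predict():
    # Expects a JSON body like {"features": [[5.1, 3.5, 1.4, 0.2]]}
    features = request.get_json()["features"]
    return jsonify({"prediction": model.predict(features).tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)  # use a WSGI server in production
```

A client would then POST a JSON payload to /predict and receive the prediction back, while authentication and throttling are handled by the API management layer in front.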
Containerization & Management: Integrating a machine learning algorithm into the delivery IT platform normally involves two major steps. The first is to enable the REST API on top of the complex machine learning algorithm running behind the microservice, and then pack the API, runtime and algorithm into a Docker container. The second is to manage and orchestrate all the containers (microservices) effectively. For this, Kubernetes handles the work of orchestrating containers onto a compute cluster and manages the workloads to ensure they run as the user intended. Apart from orchestration, Kubernetes also gives these microservices easy configuration, including discoverability, observability, updates, horizontal scaling and load balancing. The challenge is where to keep the models that are generated and updated offline. This is definitely not inside the container microservices, as containers miss a major mark in terms of data persistence: containers cannot maintain persistent data when rescheduled or destroyed. In the case of machine learning microservices, the algorithms packed in containers are supposed to read models that are continuously built, updated and pushed through some offline process. As a result, we can keep the models outside the microservices as separate entities. Models can be stored in NFS file systems or databases; with NFS, Docker can use its directory-mount capabilities. Directory mounts can be shared with other containers, and the data persists after the container has stopped and been terminated.
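A minimal sketch of that pattern, assuming the model directory is mounted into the container (for example, docker run -v /nfs/models:/mnt/models ...; paths are illustrative) and the service hot-reloads the model file whenever the offline process replaces it:

```python
# Sketch of keeping the model outside the container: the service reads the
# model from a mounted directory and hot-reloads it when the offline process
# pushes an updated file. The mount path is illustrative.
import os
import joblib

MODEL_PATH = "/mnt/models/churn_model.pkl"  # NFS/volume mount in the container

class ModelCache:
    """Reload the model only when the file on the shared mount changes."""

    def __init__(self, path):
        self.path = path
        self.mtime = None
        self.model = None

    def get(self):
        mtime = os.path.getmtime(self.path)
        if self.model is None or mtime != self.mtime:
            self.model = joblib.load(self.path)  # pick up the pushed update
            self.mtime = mtime
        return self.model

cache = ModelCache(MODEL_PATH)
# Request handlers call cache.get().predict(...) and automatically see new
# models without the container being rebuilt or restarted.
```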
Scaling & Performance: Like any other SaaS component, a primary need for ML microservices is to ensure the scalability of their services without affecting performance. Our approach to scaling ML microservices is simple and straightforward: enable auto-scaling everywhere, in every component of every service. Since the foundation of the microservices is built on cloud platforms and containers, scalability becomes a core and integral part of the solution. While choosing and building components, architects and consultants should take special care in selecting runtimes, databases, storage and computing engines with scalability in mind. If a component is deployed on pure IaaS, such as a virtual server, extra care is needed so that, as demand spikes, additional resources can be allocated at the push of a button; most leading cloud vendors offer seamless on-demand vertical scaling of resources, but it needs some sort of manual trigger to provision. Components deployed on PaaS services and containers, on the other hand, are mostly enabled with auto-scaling: any spike in load automatically scales resources out horizontally, and resources are released or turned off when demand subsides. While architecting ML microservices, two approaches can be practiced. In the first approach, we pack all the required components in a container and then scale the container itself based on demand. In the second approach, we pack only the custom algorithm in a container or virtual server, and target the rest of the components at appropriate PaaS services. The first approach effectively helps with complete separation and modularization; the second provides flexibility in terms of optimized resource management. A sketch of configuring such container auto-scaling follows.
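As one way to enable the container auto-scaling mentioned above, assuming Kubernetes with the official kubernetes Python client and an existing Deployment named "ml-service" (both illustrative), a HorizontalPodAutoscaler can be created programmatically:

```python
# Sketch of programmatically enabling horizontal auto-scaling, assuming the
# official `kubernetes` Python client and an existing Deployment named
# "ml-service" (both illustrative). Equivalent to `kubectl autoscale`.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside the cluster

hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="ml-service-hpa"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="ml-service"
        ),
        min_replicas=2,                        # baseline capacity
        max_replicas=10,                       # ceiling for demand spikes
        target_cpu_utilization_percentage=70,  # scale out above 70% CPU
    ),
)
client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="default", body=hpa
)
```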
Real-time Adaptability: In the sections above, we have shown how ML microservices can use models that were trained offline. We have also seen situations where dynamic features become part of the model, which demands real-time updates of the underlying models at regular intervals; in other words, updating the model with the modified training samples as they arrive. Deciding when to update the model versus retraining completely is critical, and it needs a solid strategy whose decisions can vary from one case to another.
There may be situations where there is no need to retrain the model when new data arrives; instead, we can fit the new data into the existing model. For example, with a procedure such as stochastic gradient descent (SGD), the true gradient is approximated by the gradient computed at a single new sample, with the learning rate treated as a hyperparameter, as shown below.
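A minimal sketch of this single-sample update, assuming scikit-learn's SGDClassifier; the data shapes, values and learning rate are illustrative:

```python
# Sketch of the single-sample update, assuming scikit-learn's SGDClassifier:
# partial_fit applies one gradient step per new sample, with the learning
# rate (eta0) as the tunable hyperparameter. Shapes/values are illustrative.
import numpy as np
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier(learning_rate="constant", eta0=0.01)

# Initial fit on the offline training data; classes must be declared up
# front so later incremental updates cannot introduce unseen labels.
X0, y0 = np.random.rand(100, 4), np.random.randint(0, 2, 100)
clf.partial_fit(X0, y0, classes=[0, 1])

# Later, in production, a new labelled sample arrives:
x_new, y_new = np.random.rand(1, 4), [1]
clf.partial_fit(x_new, y_new)  # one SGD step; no full retrain needed
```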
There may also be situations where ML algorithms learn incrementally over the data, i.e. the model is updated each time it sees a new training instance, and others where we simply buffer the relevant data and retrain the model frequently. For such situations, we should organize and design the model pipeline in such a way that it can accommodate historical data as well as data generated in the recent past, as in the sketch below.
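A minimal sketch of the buffer-and-retrain pattern, assuming scikit-learn; the buffer size, model choice and retrain trigger are illustrative design choices:

```python
# Sketch of the buffer-and-retrain pattern: recent samples are buffered and
# the model is periodically rebuilt on historical plus recent data. Buffer
# size, model choice and retrain trigger are illustrative.
from collections import deque

import numpy as np
from sklearn.ensemble import RandomForestClassifier

class BufferedRetrainer:
    def __init__(self, X_hist, y_hist, buffer_size=1000):
        self.X_hist, self.y_hist = X_hist, y_hist
        self.buffer = deque(maxlen=buffer_size)  # most recent samples
        self.model = RandomForestClassifier().fit(X_hist, y_hist)

    def add(self, x, y):
        self.buffer.append((x, y))
        if len(self.buffer) == self.buffer.maxlen:  # buffer full: retrain
            self.retrain()

    def retrain(self):
        # Combine historical data with the buffered recent past.
        X_new = np.array([x for x, _ in self.buffer])
        y_new = np.array([y for _, y in self.buffer])
        X = np.vstack([self.X_hist, X_new])
        y = np.concatenate([self.y_hist, y_new])
        self.model = RandomForestClassifier().fit(X, y)
```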
Conclusion
With e-retailers and social majors like Amazon, eBay and Twitter moving towards microservices from the hitherto convenient monolithic architecture, it is obvious that microservices are fast becoming the preferred architecture for large enterprise solutions. Some of the major advantages that support the evolution of microservices architecture are noted above. At the same time, rapid digital transformation is being driven by the disruption caused by machine learning and new-era deep learning. Cognitive analytics, evolving along the path of predictive analytics, shows a demand for building and deploying machine learning models based not only on historical data but also on streaming data. One of the major factors in the acceptability and sustainability of AI and ML is definitely going to be their applicability in real-time applications like fraud detection and personalization. Docker containers have given the deployment of ML models a definite edge over conventional VMs thanks to lightweight deployment, effective scalability and efficient automation. However, the technology is still evolving, and a lot more optimization is definitely going to happen in the near future.