Cloud Machine Learning Playground: Training and Processing ML Models in an Infrastructure-Agnostic Setup
I have been exploring simple and scalable integration concepts for training and processing machine learning algorithms in my spare time. While the idea of continually training models and deploying them (MLOps) is not new, most off-the-shelf solutions come with additional complexities that require developers or teams to be familiar with various details specific to a particular cloud vendor.

In addition to elasticity—the ability to grow or shrink infrastructure based on business needs—one of the promises of the cloud is the possibility of creating a truly infrastructure-agnostic setup. Unlike cloud-native development, cloud-agnostic design avoids some of the pitfalls of cloud development by abstracting commonalities across various areas of concern and building processes around open-source, widely available components.

This sandbox project aims to explore cloud-agnostic tools that can be used for training and processing ML models.

At a high level, the vision for this sandbox project is to have a set of repositories that can be run both locally and in the cloud, working together seamlessly to train and process simple ML algorithms.

Below is a simple diagram illustrating the setup:

[Diagram: System components]

Technologies Used:

  1. Scikit-Learn
  2. Flask
  3. NestJS
  4. Angular
  5. Docker
  6. Kubernetes

Model Training:

In the Training Docker container, we use the Kaggle Emotion Classification NLP dataset. The data is split into 80% for training and the remaining 20% for testing using stratified sampling. Scikit-Learn processes the training data to create an ML model that predicts the emotion associated with a given text, categorizing it as “Anger,” “Joy,” “Fear,” or “Sadness.”
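The actual training code lives in the linked repository; as a minimal sketch of this step, a TF-IDF plus logistic-regression pipeline with a stratified 80/20 split might look like the following. The column layout (a list of texts and a list of emotion labels) and the model filename are illustrative assumptions, not the repository's actual schema.

```python
# Minimal training sketch: TF-IDF features + logistic regression.
# The input format and "emotion_model.pkl" filename are assumptions.
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline


def train(texts, labels):
    # Stratified 80/20 split, as described above.
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.2, stratify=labels, random_state=42
    )
    model = Pipeline([
        ("tfidf", TfidfVectorizer(lowercase=True, ngram_range=(1, 2))),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    model.fit(X_train, y_train)
    print(f"test accuracy: {model.score(X_test, y_test):.2f}")

    # Export the fitted pipeline so the serving container can load it.
    with open("emotion_model.pkl", "wb") as f:
        pickle.dump(model, f)
    return model
```

Pickling the whole pipeline (vectorizer plus classifier) keeps the exported artifact self-contained, so the serving container does not need to reproduce the feature-extraction step.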

Flask App for Model Serving:

Following training, the model is exported to another application that integrates it with a Flask server. This setup allows scoring via an accessible communication interface—specifically, an HTTP request.

NestJS for Business Model Processing:

NestJS serves as the backend for our solution, consolidating all business model processing and updates in a central location. The NestJS application provides an additional layer in the architecture, isolating specific business concerns. This approach aligns well with the project’s scope and data storage requirements, paving the way for a future isolated microservices setup.

Networking and Connectivity:

Given that our solution relies on individual containers, establishing communication between them is crucial. Here’s how the containers are interconnected for local and cloud scenarios:

  • Local Networking: When running the solution locally, Docker containers communicate over a user-defined bridge network, with network aliases facilitating inter-container communication. (Note: the legacy `--link` option in Docker networking has been superseded by user-defined bridge networks, which provide automatic DNS-based name resolution between containers.)
  • Cloud Networking: In the cloud environment, NGINX handles ingress, routing traffic between containers.
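To illustrate the local networking setup, a Docker Compose sketch with a user-defined bridge network and aliases might look like this. The service names, image names, and ports are assumptions for illustration, not the repository's actual configuration:

```yaml
# Hypothetical docker-compose.yml sketch; all names and ports are assumptions.
services:
  model-server:
    image: emotion-flask:latest
    networks:
      ml-net:
        aliases:
          - model-server   # reachable from other containers as http://model-server:5000
  backend:
    image: emotion-nestjs:latest
    ports:
      - "3000:3000"       # exposed to the host for the Angular client
    networks:
      - ml-net

networks:
  ml-net:
    driver: bridge         # user-defined bridge with automatic DNS resolution
```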

Consuming the Algorithm:

A sample Angular application sends requests to a NestJS server, consuming the ML algorithm’s results.

Deploying to the Cloud:

To maintain cloud-agnosticism, we can use kOps (Kubernetes Operations) to abstract the cluster-provisioning specifics of individual cloud vendors. This approach allows deploying the solution to Kubernetes clusters on AWS, Google Cloud, and Azure (currently in alpha).

For your reference, here is the source code for the solution:

Github: Docker-based solution

References:

[1] Emotion classification dataset

[2] Y. Abgaz et al., “Decomposition of Monolith Applications Into Microservices Architectures: A Systematic Review,” in IEEE Transactions on Software Engineering, vol. 49, no. 8, pp. 4213–4242, Aug. 2023, doi: 10.1109/TSE.2023.3287297.

[3] Marcos Esteve, “Deploying an NLP model with Docker and FastAPI.”
