Email Spam Detection System with Airflow & Kubernetes

Just Published: My Third End-to-End MLOps Project – Email Spam Detection System

Building on my previous MLOps projects, I've now developed a production-ready Email Spam Detection system with advanced workflow orchestration and deployment strategies. This project focuses on learning and applying Airflow, Kubernetes, and boosting models for scalable ML pipelines. The goal was to build production-grade MLOps pipelines, including experiment tracking, pipeline orchestration, and containerized deployment, rather than deploying every model.

📌 What I built
• Full ML workflow with Airflow DAGs for pipeline orchestration, integrating MLflow separately for experiment tracking.
• Multiple boosting classification models (AdaBoost, Gradient Boosting, etc.) running within the same pipeline.
• Containerized deployment using Docker and orchestration using Kubernetes (Minikube + kubectl).
• Modular project structure covering data ingestion, preprocessing, model training, evaluation, and deployment.
• I trained two models and deployed the best-performing one; the pipeline is modular and ready for future model experimentation.

🛠 Key Skills & Tools
Python • MLOps • Airflow • MLflow (metric tracking, logging, and deploying trained models) • Docker • Kubernetes (K8s) • CI/CD • Boosting Models • Production-grade ML pipelines

💡 Improvements over my 2nd project
• Airflow for robust workflow orchestration and explicit task dependencies.
• Separate MLflow setup instead of ZenML's built-in tracking.
• Multiple models in the same pipeline, supporting experimentation with boosting classifiers.
• Kubernetes basics & deployment: pods, deployments, services, and Minikube setup.
• Learned how to serialize large objects, manage XComArgs, and troubleshoot common Airflow errors.
• Hands-on experience with Docker + K8s deployment and managing containerized ML applications.

💡 Key Learnings
• Airflow DAG structure, task dependencies, and XCom management.
• Integration of Airflow with MLflow for reproducible experiment tracking.
• Boosting models and running multiple classifiers in a single pipeline.
• Kubernetes concepts for managing scalable containerized applications.
• Ensuring version consistency between training and deployment environments.

🔗 Check out the repo here: https://lnkd.in/eVTYEm5C

This project was a deep dive into scalable MLOps architectures, bridging the gap between pipeline orchestration, experiment tracking, and cloud-native deployment.

#MLOps #MachineLearning #DataScience #Python #Airflow #MLFlow #Docker #Kubernetes #BoostingModels #PipelineOrchestration #ProductionML #ProjectShowcase #UKTech #TechJobsUK #UKJobs #HiringUK #LondonTech
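The "train several boosting models, pick the best" flow described above can be sketched in plain Python. This is a simplified stand-in, not the repo's actual code: in the real project each stage is an Airflow task (with small results passed via XCom) and every fit is logged to MLflow, while here ordinary functions mimic the task chain and the dataset, stage names, and model parameters are all illustrative.

```python
# Airflow-free sketch of the pipeline's task ordering and best-model selection.
# In the real project these functions would be Airflow @task-decorated tasks,
# and each accuracy would be recorded with mlflow.log_metric().
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def ingest():
    # Stand-in for loading and featurizing raw emails.
    X, y = make_classification(n_samples=500, n_features=20, random_state=42)
    return train_test_split(X, y, test_size=0.2, random_state=42)


def train_and_evaluate(X_tr, X_te, y_tr, y_te):
    # Train both boosting classifiers within the same "pipeline run".
    models = {
        "adaboost": AdaBoostClassifier(n_estimators=100, random_state=42),
        "gradient_boosting": GradientBoostingClassifier(random_state=42),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        scores[name] = accuracy_score(y_te, model.predict(X_te))
    return scores


def select_best(scores):
    # Deploy only the best-performing model, as the post describes.
    best = max(scores, key=scores.get)
    return best, scores[best]


# Explicit dependency chain, mirroring ingest >> train >> select in the DAG.
best_name, best_acc = select_best(train_and_evaluate(*ingest()))
print(best_name, round(best_acc, 3))
```

Keeping selection as its own stage means new classifiers can be added to the `models` dict without touching the deployment step, which is the kind of modularity the post highlights.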

