Weekend Curiosity to Practice - Software Engineering to Machine Learning (Ops)

Hey everyone, 

Lately, I've been diving deep into the world of machine learning alongside my software development work, especially since the LLM era began. For the last year I've had a real curiosity and interest in the field, but after starting my master's degree it turned from a hobby into something closer to a profession: juggling weekend research, side projects, open-source repos, and so on. Along the way I've noticed a few things I think are worth sharing with fellows working in this field. (FYI, this is not a GPT-generated post 🤖, just me.)

It seems that many small and medium-sized teams and companies jumping into AI, building new models and applications for production, are hitting some snags. The usual software development practices don't always align with the unique demands of ML processes and workflows, which can lead to inefficiencies and performance/cost headaches. There are a lot of talented folks who are strong on the ML/data (scientific) side but weaker on the engineering part. And on the flip side, there are great engineers eager to jump into ML, but they usually stop where GPT-powered app development begins (MCP, RAG, Q&A retrievers, etc.). These areas and concepts are cool and popular, and there's no going back from them now.

There are also those who want to go a little further and build their own custom models on their own servers, in completely private systems. A common pitfall I've seen is that traditional DevOps and software development approaches, like stateless systems, simplistic load balancing that destroys cache efficiency, CI/CD pipelines that lack data and model versioning, and basic autoscaling that ignores GPU and data-transfer bottlenecks, don't always play nice with ML. For example, in classic request-response web apps we generally try to spread load evenly across instances. But in ML model deployments this can kill cache efficiency and cause a cold start for each similar query. That means wasted compute, higher costs, and a negative impact on energy efficiency and environmental sustainability.
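To make the cache-locality point concrete, here's a minimal sketch (names and replica count are my own illustration, not from any specific framework): round-robin spreads a user's follow-up queries across different replicas, each with a cold cache, while a simple consistent hash on the session keeps them on one replica whose prompt/KV cache stays warm.

```python
import hashlib

NUM_REPLICAS = 4

def round_robin(request_index: int) -> int:
    # Classic web-style balancing: spreads requests evenly,
    # but ignores which replica already has a warm cache.
    return request_index % NUM_REPLICAS

def cache_affinity(session_id: str) -> int:
    # Hash the session id so repeat queries from the same user
    # always land on the same replica (a "sticky" assignment).
    digest = hashlib.sha256(session_id.encode()).digest()
    return digest[0] % NUM_REPLICAS

# The same user sends three follow-up queries:
rr_targets = [round_robin(i) for i in range(3)]            # three different replicas, three cold caches
sticky_targets = [cache_affinity("user-42") for _ in range(3)]  # one replica, warm cache reused
```

In a real deployment you'd get the same effect with things like Kubernetes `sessionAffinity` or a consistent-hash ring in the ingress layer; the sketch just shows why even spreading is the wrong default for cache-heavy inference.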

Instead, we should be looking at things like sticky sessions, stateful inference, batch processing, dynamic resource scaling, and proper data-management pipelines. By using weighted routing algorithms that consider cache status and resource usage, we can really boost performance and cut costs. There are some awesome open-source tools and frameworks for this, most of them compatible with the Kubernetes ecosystem.
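Here's one way such a weighted algorithm could look; this is a hypothetical sketch (the `Replica` fields, the 0.7 cache weight, and the replica names are all illustrative assumptions, not a real scheduler's API). Each replica gets a score that trades off a likely cache hit against current GPU load, and the request goes to the highest-scoring one.

```python
from dataclasses import dataclass, field

@dataclass
class Replica:
    name: str
    cached_prefixes: set = field(default_factory=set)  # prompt prefixes warm in this replica's cache
    gpu_util: float = 0.0                              # 0.0 (idle) .. 1.0 (saturated)

def score(replica: Replica, prefix: str, cache_weight: float = 0.7) -> float:
    # Weighted score: a warm cache usually outweighs moderate load,
    # because recomputing the prefix costs more than queueing briefly.
    cache_hit = 1.0 if prefix in replica.cached_prefixes else 0.0
    headroom = 1.0 - replica.gpu_util
    return cache_weight * cache_hit + (1.0 - cache_weight) * headroom

def pick_replica(replicas: list, prefix: str) -> Replica:
    return max(replicas, key=lambda r: score(r, prefix))

replicas = [
    Replica("gpu-a", cached_prefixes={"translate:"}, gpu_util=0.8),
    Replica("gpu-b", gpu_util=0.2),
]

pick_replica(replicas, "translate:")   # gpu-a wins: warm cache beats its higher load
pick_replica(replicas, "summarize:")   # gpu-b wins: no cache hit anywhere, so headroom decides
```

The interesting design choice is the weight: tune it too high and you overload the "popular" replica; too low and you're back to cache-oblivious balancing.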

The MLOps field is still pretty new, and finding people equally skilled in software, DevOps/infra, and machine learning is hard. That's why I think there's an opportunity here for those of us willing to bridge that gap. It can become a tradeoff at larger scale, though: as companies grow (or in big corporates), they usually demand deep expertise rather than a Swiss Army knife.

If you're working (or trying to work) in this field, I'd love to hear your thoughts. Have you encountered similar challenges or found some cool solutions? Don't hesitate to send me a direct message.

Thank you for sharing, Huseyin Isik


