Lambda Architecture for ML Models

Sunil Narsinghani

Published Mar 27, 2019

The Lambda architecture is a layered data processing architecture, designed for processing large volumes of data continuously. It has three layers: the Batch layer, the Fast layer (often called the speed layer) and the Serving layer. The Batch layer processes the accumulated data, and together the layers keep a data view updated in near real time. A data view usually means a data table that is used for reporting.

In ML, what gets created is a model. By definition, ML models are representations of real data in the form of mathematical equations. These equations have parameters, which are derived in the learning process. Hence, an ML model can be considered another form of data view. After all, its parameters are derived from real-world data.

Real-world data almost always arrives continuously, in real time. This frequently makes an ML model outdated, because its parameters do not reflect the newest data.

Often the newest information is the most relevant. On ML platforms, models are frequently discarded and created afresh. If the model is built over big data, rebuilding it costs a lot more.

A newer, rather unexplored concept is to upgrade the model. Upgrading means not discarding the model but updating its parameter values by processing only the latest data, i.e. the data that arrived after the previous update.

The fundamental question is how to update the model without reprocessing the old data. The answer lies in the mathematics of the ML model-building process. When an ML model is built, (big) data is processed in a series of steps, each representing a task. Many of these steps have the properties of additivity and commutativity, the same two properties that are required to process data in parallel. If those steps are identified and the matrices they produce are updated with the new data, the model can be upgraded.
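As an illustration of such an additive, commutative step, here is a minimal sketch for one-feature least-squares regression. The names `RegressionStats`, `summarize` and `fit` are hypothetical, and the choice of model is an assumption made for the example, not something taken from the article. The five running sums are the entries of the matrices X^T X and X^T y, so keeping them around is enough to refit the model after new data arrives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegressionStats:
    """Sufficient statistics for fitting y = slope * x + intercept."""
    n: int = 0
    sum_x: float = 0.0
    sum_y: float = 0.0
    sum_xx: float = 0.0
    sum_xy: float = 0.0

    def __add__(self, other):
        # Additive: the statistics of two chunks are summed field by field.
        # Commutative: the order in which chunks are added does not matter.
        return RegressionStats(
            self.n + other.n,
            self.sum_x + other.sum_x,
            self.sum_y + other.sum_y,
            self.sum_xx + other.sum_xx,
            self.sum_xy + other.sum_xy,
        )


def summarize(points):
    """Reduce a chunk of (x, y) pairs to its sufficient statistics."""
    stats = RegressionStats()
    for x, y in points:
        stats = stats + RegressionStats(1, x, y, x * x, x * y)
    return stats


def fit(stats):
    """Solve the least-squares normal equations from the statistics alone."""
    denom = stats.n * stats.sum_xx - stats.sum_x ** 2
    slope = (stats.n * stats.sum_xy - stats.sum_x * stats.sum_y) / denom
    intercept = (stats.sum_y - slope * stats.sum_x) / stats.n
    return slope, intercept


old_data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # already processed
new_data = [(4.0, 9.0), (5.0, 11.0)]             # arrived later

old_stats = summarize(old_data)              # computed once, then stored
upgraded = old_stats + summarize(new_data)   # only the new data is read

slope, intercept = fit(upgraded)
```

The upgraded statistics are identical to what a full pass over all five points would produce, which is why the old data never has to be read again. Models whose training steps do not reduce to sums like these would need a different upgrade strategy.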

How could the Lambda architecture be useful here? Following the Lambda architecture principle, first create an ML model from batch data, i.e. data at rest. Then update it continuously (at a shorter interval) using the Fast layer with hot data. The Serving layer will then always have the best possible ML model for scoring.
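The flow described above can be sketched in a few lines. This is only a toy, under stated assumptions: `MeanModel`, `batch_layer`, `fast_layer` and `ServingLayer` are hypothetical names, and the "model" is just a running mean so that the three roles stay visible. A real system would put storage and a stream processor behind these functions.

```python
class MeanModel:
    """Toy model (an assumption for this sketch): predicts the mean seen so far."""

    def __init__(self, count=0, total=0.0):
        self.count = count
        self.total = total

    def update(self, values):
        # Fold in new observations without revisiting the old ones.
        for v in values:
            self.count += 1
            self.total += v
        return self

    def predict(self):
        return self.total / self.count


def batch_layer(data_at_rest):
    """Build the initial model from the full historical data set."""
    return MeanModel().update(data_at_rest)


def fast_layer(model, hot_data):
    """Upgrade the existing model with a small batch of recent data."""
    return model.update(hot_data)


class ServingLayer:
    """Holds the most recently published model and answers scoring requests."""

    def __init__(self, model):
        self.model = model

    def publish(self, model):
        self.model = model

    def score(self):
        return self.model.predict()


# Batch layer: one expensive pass over data at rest.
serving = ServingLayer(batch_layer([10.0, 12.0, 14.0]))
first_score = serving.score()

# Fast layer: cheap, frequent upgrades as hot data arrives.
for micro_batch in ([16.0], [18.0, 20.0]):
    serving.publish(fast_layer(serving.model, micro_batch))

latest_score = serving.score()
```

Here the serving layer always scores with the latest published model, and the batch pass over the historical data happens only once.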

How can this be implemented? Two fundamental things about data and models need to be considered. First, an ML model's input data is converted to matrix form, and the subsequent steps produce revised matrices representing the same data. Second, ML models are optimization problems, iterative in nature: each iteration takes a solution as input and produces a revised (improved) solution as output.
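Both points can be seen in a small gradient-descent sketch. It assumes a linear model trained on mean squared error, which is a choice made for this example and not the author's stated method, and the helper names `to_matrix`, `mse` and `refine` are hypothetical. The key detail is that `refine` accepts an existing solution, so an upgrade can start from the previous parameters instead of from zero.

```python
def to_matrix(records):
    """Convert raw (x, y) records into a design matrix X and target vector y."""
    X = [[1.0, x] for x, _ in records]  # column of ones for the intercept
    y = [t for _, t in records]
    return X, y


def mse(params, X, y):
    """Mean squared error of the linear model on (X, y)."""
    errors = [sum(p * v for p, v in zip(params, row)) - t for row, t in zip(X, y)]
    return sum(e * e for e in errors) / len(y)


def refine(params, X, y, steps=200, lr=0.01):
    """Gradient descent: takes a solution in, returns a revised solution."""
    params = list(params)
    for _ in range(steps):
        grad = [0.0] * len(params)
        for row, t in zip(X, y):
            err = sum(p * v for p, v in zip(params, row)) - t
            for j, v in enumerate(row):
                grad[j] += 2.0 * err * v / len(y)
        params = [p - lr * g for p, g in zip(params, grad)]
    return params


X_old, y_old = to_matrix([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)])
X_new, y_new = to_matrix([(1.0, 3.5), (2.0, 6.0), (3.0, 8.5)])

# Initial model, built from data at rest.
initial = refine([0.0, 0.0], X_old, y_old, steps=2000)

# Upgrade: the previous solution is the input, only new data is processed.
upgraded = refine(initial, X_new, y_new, steps=50)

loss_before = mse(initial, X_new, y_new)
loss_after = mse(upgraded, X_new, y_new)
```

One caveat of this sketch: refining on new data alone pulls the parameters toward the new data and away from the old, so how much weight the latest data should get is a design decision that the upgrade process has to make explicitly.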

More details about the model upgrade process will be discussed in subsequent posts.

Vinayak Pai 7y

Interesting thoughts. Waiting for the next article
