Lambda Architecture in 2020

As I start to think about some of the upcoming projects we'll be working on over the next year and how we might go about building them, I wanted to consider where lambda architecture fits in our toolbox for building data services.

Having led a project a few years ago that tried to follow lambda architecture, I've got a decent feel for its strengths and weaknesses. In my opinion, lambda architecture is based on two key assumptions:

1. Streaming processes are in some way imperfect (e.g. lossy, maybe slow for all but simple processing, unable to handle late or replayed data, unable to do complex aggregations, etc)

2. Users are happy to have either missing or approximated data for a while until the batch process completes
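Assumption 2 comes from how the serving layer works: queries are answered by merging an exact batch view with an approximate real-time view that covers the gap since the last batch run. A minimal sketch in plain Python (the function and data here are illustrative, not from any particular lambda implementation):

```python
def merge_views(batch_view, speed_view):
    """Serving-layer merge: the batch view is authoritative for the
    period it covers; the speed view adds (possibly approximate)
    counts for events that arrived after the last batch run."""
    merged = dict(batch_view)
    for key, approx_count in speed_view.items():
        merged[key] = merged.get(key, 0) + approx_count
    return merged

# Batch view: exact page-view counts up to the last batch run.
batch_view = {"/home": 1000, "/pricing": 250}
# Speed view: counts since then, which may be lossy or duplicated.
speed_view = {"/home": 42, "/signup": 7}

print(merge_views(batch_view, speed_view))
# → {'/home': 1042, '/pricing': 250, '/signup': 7}
```

Until the next batch run replaces the speed layer's numbers, users are looking at the approximate part of this merge — which is exactly where the trust problem below comes from.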

A problem for many use cases is that users do not accept assumption 2, or do not understand why it should hold, and therefore start losing trust in the whole system.

I also don't think assumption 1 holds up as much now that we have frameworks like Beam and Flink, which are a lot better than what we had 3-5 years ago.
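A big part of why assumption 1 has weakened is event-time processing: these frameworks track a watermark and can still update a window when late data arrives within an allowed lateness, rather than silently dropping it. Here's a toy sketch of that idea in plain Python — it is not the Beam or Flink API, just an illustration of the mechanism, with all names invented for the example:

```python
from collections import defaultdict

def window_counts(events, window_size, watermark, allowed_lateness):
    """Toy event-time windowing: assign each event to a tumbling window
    by its event timestamp, accepting late events as long as the
    window's end is within allowed_lateness of the watermark."""
    counts = defaultdict(int)
    dropped = []
    for name, event_time in events:
        window_start = (event_time // window_size) * window_size
        window_end = window_start + window_size
        if watermark - window_end > allowed_lateness:
            dropped.append((name, event_time))  # window closed for good
        else:
            counts[window_start] += 1
    return dict(counts), dropped

# Events carry event timestamps and arrive out of order.
events = [("a", 3), ("b", 12), ("c", 7)]
counts, dropped = window_counts(events, window_size=10,
                                watermark=15, allowed_lateness=10)
# "c" arrives after the watermark has passed its window's end, but
# within the allowed lateness, so the [0, 10) window is updated
# rather than the event being lost.
print(counts)   # → {0: 2, 10: 1}
print(dropped)  # → []
```

With this kind of correctness built into the streaming path, the batch layer's role as the "fixer" of streaming mistakes largely disappears.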

The exception to this might be if you really need (imperfect) data very, very fast. But even then, you would probably be better off simplifying your requirements rather than adding the complexity of this architecture.

I think lambda architecture is now a product of its (really very brief) time, and shouldn't be seriously considered for building data services these days. Having said that, the problems it identifies and its approach to solving them are still interesting to learn from, so the book may still be worth a read if you are interested in the area.

---

Originally published at https://andrew-jones.com/blog/lambda-architecture-in-2020/

Cover image from Unsplash.
