If you are working with modern, large-scale distributed systems, Apache Kafka is almost unavoidable. Originally detailed in a 2011 paper by engineers at LinkedIn, Kafka fundamentally changed how we handle massive streams of data.
Here is a breakdown of Kafka’s architecture from a system design perspective.
1. The Problem Kafka Solved
Before Kafka, data integration was a point-to-point mess. Every data source (databases, app logs) was directly wired to every destination (data warehouses, monitoring dashboards). Traditional message brokers (like RabbitMQ) delete messages once they are consumed and track per-message delivery state on the broker, a model that buckles under the massive throughput required for web-scale event tracking.
The Solution: a unified, high-throughput, decoupled platform built on a fundamentally different abstraction: the Distributed Commit Log. Kafka appends immutable records sequentially to the end of a log file and leans on the OS page cache, making disk reads and writes extremely fast.
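As a mental model, the commit log fits in a few lines of Java. This is an illustration of the abstraction only, not Kafka's actual storage code:

```java
import java.util.ArrayList;
import java.util.List;

// A toy append-only commit log (illustration only): records are immutable,
// appended at the end, and addressed by a monotonically increasing offset.
public class CommitLog {
    private final List<byte[]> records = new ArrayList<>();

    // Append returns the offset assigned to the new record.
    long append(byte[] record) {
        records.add(record);
        return records.size() - 1;
    }

    // Reads never mutate the log; each consumer just remembers its own offset.
    byte[] read(long offset) {
        return records.get((int) offset);
    }

    public static void main(String[] args) {
        CommitLog log = new CommitLog();
        long first = log.append("user-42 clicked /home".getBytes());
        long second = log.append("user-42 clicked /cart".getBytes());
        System.out.println(first + ", " + second); // 0, 1: strictly sequential
    }
}
```

The real log is persisted to disk as segment files, but the contract is the same: writes only append, reads are by offset, and nothing is modified in place.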
2. Core Architecture: The Basics
Kafka's scalability relies on a few core primitives:
- Topics & Partitions: A topic (e.g., user_clicks) is the logical category for data; physically, it is split into Partitions, which are distributed across different servers (brokers). Data inside a partition is strictly ordered, and each record is assigned a sequential ID called an offset.
- Producers & Consumers: Producers write data to topics (often hashing a key like user_id to ensure related events hit the same partition). Consumers read data, tracking their place using the offset.
- Consumer Groups: This is how Kafka scales consumption. A consumer group is a team of consumers reading a topic. The Golden Rule: within a single group, each partition is assigned to exactly one consumer. If you add more consumers than partitions, the extra consumers sit idle. (A minimal producer/consumer sketch follows this list.)
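Here is a minimal sketch using Kafka's Java client. The broker address and the group ID clicks-dashboard are assumptions for illustration; the topic name user_clicks comes from the example above:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ClickPipeline {
    public static void main(String[] args) {
        // Producer: keying by user_id means all of a user's events hash
        // to the same partition, preserving per-user ordering.
        Properties p = new Properties();
        p.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        p.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        p.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
            producer.send(new ProducerRecord<>("user_clicks", "user-42", "clicked:/home"));
        }

        // Consumer: every instance started with the same group.id joins one
        // group, and Kafka assigns each partition to exactly one of them.
        Properties c = new Properties();
        c.put("bootstrap.servers", "localhost:9092");
        c.put("group.id", "clicks-dashboard"); // hypothetical consumer group
        c.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        c.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(c)) {
            consumer.subscribe(List.of("user_clicks"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> r : records) {
                System.out.printf("partition=%d offset=%d value=%s%n",
                        r.partition(), r.offset(), r.value());
            }
        }
    }
}
```

Starting a second copy of this consumer with the same group.id triggers a rebalance that splits the partitions between the two instances, which is the entire scaling mechanism.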
3. Data Management: Compaction and Tombstones
Unlike traditional queues, Kafka retains messages for a configured retention period (e.g., 7 days) or until a partition exceeds a size limit (e.g., 50 GB), after which the oldest log segments are deleted.
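For illustration, here is roughly how those limits map to topic configs, set here via the Java AdminClient (the broker address and the exact values are assumptions):

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateRetentionTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        try (AdminClient admin = AdminClient.create(props)) {
            // Time- and size-based retention: whichever limit is hit first
            // triggers deletion of the oldest log segments.
            NewTopic topic = new NewTopic("user_clicks", 3, (short) 3)
                    .configs(Map.of(
                            "retention.ms", "604800000",     // 7 days
                            "retention.bytes", "53687091200" // 50 GB per partition
                    ));
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```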
For scenarios where you only care about the latest state (like streaming database updates), Kafka uses Log Compaction. A background thread removes older records that share the same key as a newer record. To delete a record entirely, producers send a Tombstone—a message with the target key but a null payload. The compactor uses this as a delete marker to eventually scrub the key from the system.
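Here is a short sketch of what a tombstone looks like from the producer side, assuming a hypothetical user_profiles topic created with cleanup.policy=compact:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TombstoneExample {
    public static void main(String[] args) {
        Properties p = new Properties();
        p.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        p.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        p.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Assumes "user_profiles" was created with cleanup.policy=compact,
        // so compaction keeps only the latest record per key.
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
            // Normal update: the latest value for the key survives compaction.
            producer.send(new ProducerRecord<>("user_profiles", "user-42", "{\"plan\":\"pro\"}"));
            // Tombstone: same key, null value. The compactor treats this as
            // a delete marker and eventually scrubs the key from the log.
            producer.send(new ProducerRecord<>("user_profiles", "user-42", null));
        }
    }
}
```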
4. High Availability and Fault Tolerance
Kafka survives broker crashes through replication, configured by a Replication Factor.
- Leader and Followers: One broker is the Leader for a partition, handling all reads and writes. The others are Followers, passively replicating the data.
- The ISR (In-Sync Replicas): Kafka tracks which followers are actively keeping up with the leader. Only brokers in the ISR are eligible to become the new leader if the current one crashes, preventing data loss. (A producer-side sketch of how these guarantees surface in configuration follows this list.)
- The Shift to KRaft: Historically, Kafka relied on an external system, Apache ZooKeeper, to manage metadata and leader elections. This became a bottleneck at scale. Modern Kafka uses KRaft (Kafka Raft), integrating a consensus protocol directly into Kafka. Metadata is now stored in an internal Kafka topic, allowing near-instant failover and a design that targets millions of partitions.
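On the client side, replication guarantees surface in producer configuration. A sketch, assuming the topic was created with replication factor 3 (as in the earlier AdminClient example) and min.insync.replicas=2 set on the topic or broker:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class DurableProducer {
    public static void main(String[] args) {
        Properties p = new Properties();
        p.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        // acks=all: the leader waits for every in-sync replica before
        // acknowledging. Combined with min.insync.replicas=2 on the topic,
        // an acked write is guaranteed to exist on at least two brokers.
        p.put("acks", "all");
        p.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        p.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
            producer.send(new ProducerRecord<>("user_clicks", "user-42", "clicked:/checkout"));
        }
    }
}
```

With this setup, a write fails outright if fewer than two replicas are alive: a deliberate trade of availability for durability.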
5. Exactly-Once Semantics (EOS)
In a distributed system, network timeouts are inevitable. If a producer's message is written, but the network drops the acknowledgment (ack), the producer must retry, potentially creating duplicate data.
Kafka solves this through three mechanisms (a combined code sketch follows the list):
- The Idempotent Producer: Kafka assigns producers a unique ID and requires them to send sequence numbers with messages. If a broker receives a duplicate sequence number due to a retry, it silently drops the duplicate but sends back a successful ack.
- Transactions: For complex "Read-Process-Write" loops, Kafka uses a Transaction Coordinator and a Transaction Log (similar to a two-phase commit). It writes a "Commit Marker" to the topic only when the entire transaction succeeds.
- Read-Committed Consumers: Consumers configured with isolation.level=read_committed will not read past an open transaction; they wait for the commit or abort marker before those records are delivered.
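Putting the three pieces together, here is a sketch of a transactional read-process-write loop with the Java client. The orders and processed_orders topics, the group ID, and the transactional ID are illustrative, not from the original post:

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.TopicPartition;

public class ExactlyOnceLoop {
    public static void main(String[] args) {
        Properties pp = new Properties();
        pp.put("bootstrap.servers", "localhost:9092");    // assumed broker address
        pp.put("transactional.id", "orders-processor-1"); // stable ID; implies idempotence
        pp.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        pp.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        Properties cp = new Properties();
        cp.put("bootstrap.servers", "localhost:9092");
        cp.put("group.id", "orders-processor");
        cp.put("enable.auto.commit", "false");       // offsets commit inside the transaction
        cp.put("isolation.level", "read_committed"); // never deliver uncommitted records
        cp.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        cp.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(pp);
             KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cp)) {
            producer.initTransactions();             // registers with the coordinator
            consumer.subscribe(List.of("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                if (records.isEmpty()) continue;
                producer.beginTransaction();
                try {
                    Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
                    for (ConsumerRecord<String, String> r : records) {
                        producer.send(new ProducerRecord<>("processed_orders", r.key(), r.value()));
                        offsets.put(new TopicPartition(r.topic(), r.partition()),
                                new OffsetAndMetadata(r.offset() + 1)); // next offset to read
                    }
                    // Commit consumed offsets atomically with the output writes,
                    // so "read" and "write" succeed or fail together.
                    producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata());
                    producer.commitTransaction();    // writes the commit marker
                } catch (KafkaException e) {
                    producer.abortTransaction();     // read_committed consumers skip these
                }
            }
        }
    }
}
```

The failure path is simplified: a producer fenced by a newer instance using the same transactional.id should close rather than abort, but the shape of the loop is the important part.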
Kafka’s design is a masterclass in leveraging simple file system mechanics and clever distributed consensus to solve incredibly complex data problems at scale.
- Kreps, J., Narkhede, N., & Rao, J. (2011). "Kafka: a Distributed Messaging System for Log Processing." Proceedings of the 6th International Workshop on Networking Meets Databases (NetDB '11).
- Mehta, A., & Gustafson, J. (2017). "KIP-98 - Exactly Once Delivery and Transactional Messaging." Apache Kafka Wiki.
- McCabe, C. (2019). "KIP-500: Replace ZooKeeper with a Self-Managed Metadata Quorum." Apache Kafka Wiki.
- Apache Software Foundation. (2024). "Apache Kafka Documentation." Apache Kafka Official Website.
- Sen, G. System Design Course.
Really clean breakdown of Kafka's core concepts. The part about Exactly-Once Semantics is what most tutorials skip completely. Duplicate messages caused by network retries are a real production problem, and most engineers only discover it after something goes wrong in their payment or order system. Idempotent producers solve it cleanly without any extra complexity on the consumer side.