Kafka as a Messaging System

Traditionally, there were two messaging models: queuing and publish-subscribe. Kafka then arrived as a new kind of messaging system.

Before looking at Kafka and its features, we need to understand queuing and publish-subscribe, and in particular the pros and cons of these traditional models that led to the invention of Kafka.

Queuing: we all know queues from data structures. A queue has two ends, a front and a rear, and works on a FIFO (first in, first out) basis. Queuing in messaging works the same way: a source produces data at one end and a consumer consumes it from the other end.

Queuing lets you divide the processing of data over multiple consumer instances, which lets you scale up processing capacity. But queues aren't multi-subscriber: once one consumer reads a message, it is consumed and no longer available to others.

To understand this better, let's take a real-life example.

Suppose you are standing in a queue outside an airport to get a taxi. When a taxi pulls up, the person at the front of the queue gets in, tells the driver the destination, and leaves.

Is it possible for two people to take the same cab to two destinations in two different directions? Definitely not.

Other examples are a city water supply or an Amazon package delivery. The same package cannot be delivered to two people at two different addresses.

Coming back to our point: queuing has one advantage, that it can distribute the data processing load over multiple consumers, and one disadvantage, that the same data cannot be made available to multiple consumers.
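The queuing behaviour described above can be sketched in a few lines of Python. This is only an illustration using an in-memory deque, not a real message broker; the function name drain_queue and the worker names are made up for the example, and round-robin is just one simple way of sharing out the work.

```python
from collections import deque


def drain_queue(messages, consumer_names):
    """Hand each message to exactly one consumer (simple round-robin)."""
    queue = deque(messages)
    received = {name: [] for name in consumer_names}
    turn = 0
    while queue:
        # popleft() removes the message from the queue,
        # so no other consumer can ever read it again.
        message = queue.popleft()
        consumer = consumer_names[turn % len(consumer_names)]
        received[consumer].append(message)
        turn += 1
    return received


received = drain_queue(["m1", "m2", "m3", "m4"], ["worker-a", "worker-b"])
print(received)
```

Here the load is split between the two workers, but each message is seen by only one of them, which is exactly the advantage and the disadvantage of queuing.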

Publish-subscribe: it allows you to broadcast data to multiple consumers, but it has no way of scaling processing, since every message goes to every subscriber.

For example, think of a radio broadcast or live news: the same feed or stream is transmitted to everybody.

So publish-subscribe has one advantage, that data is available to multiple consumers, and one disadvantage, that processing can't be scaled.
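For comparison, publish-subscribe can be sketched the same way. Again this is only an in-memory illustration; the function name broadcast and the subscriber names are assumptions made for the example.

```python
def broadcast(messages, subscriber_names):
    """Deliver every message to every subscriber."""
    received = {name: [] for name in subscriber_names}
    for message in messages:
        for name in subscriber_names:
            # Each subscriber gets its own copy of the message.
            received[name].append(message)
    return received


feeds = broadcast(["m1", "m2", "m3"], ["listener-a", "listener-b"])
print(feeds)
```

Every listener ends up with the full list of messages. Adding more listeners does not reduce the work each one has to do, which is why this model cannot scale processing.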

So each model has one pro and one con, and Kafka came as a hybrid of queuing and publish-subscribe.

Kafka has the concept of a consumer group, which gives it both advantages: it can divide the data processing load over the members of a consumer group, and at the same time it makes the same data available to multiple consumer groups.
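The consumer group idea can be sketched as a combination of the two models. This is a simplified simulation and not the Kafka client API: the function deliver and the group and member names are hypothetical, and messages are shared out one by one here, whereas Kafka actually shares out work per partition.

```python
def deliver(messages, groups):
    """groups maps a group name to its list of member names.

    Every group sees every message (publish-subscribe behaviour);
    inside a group each message goes to one member (queue behaviour).
    """
    received = {}
    for group, members in groups.items():
        per_member = {member: [] for member in members}
        for index, message in enumerate(messages):
            # Simplification: split per message, round-robin.
            owner = members[index % len(members)]
            per_member[owner].append(message)
        received[group] = per_member
    return received


result = deliver(
    ["m1", "m2", "m3", "m4"],
    {"billing": ["b1", "b2"], "analytics": ["a1"]},
)
print(result)
```

Both groups receive all four messages, while the two members of the billing group split them between themselves.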


A traditional queue stores data in order, but when multiple consumers consume from it, that ordering is lost, because the data is distributed across consumers that process it in parallel.

Kafka does this better by using partitions within a topic, which provide both ordering and parallelism over a pool of consumers. Data is stored in order within each partition, and Kafka assigns the partitions of a topic to the consumers in a consumer group so that each partition is consumed by only one consumer in that group. Ordering is therefore guaranteed within a partition, not across the whole topic. Because a topic's data is spread over many partitions, we still get parallelism over the pool of consumers.
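The partitioning idea can be sketched as follows. This is a toy model under stated assumptions, not Kafka's implementation: zlib.crc32 stands in for Kafka's own key hashing, the round-robin assignment is a simplification of Kafka's assignment strategies, and all function names and record values are invented for the example.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a record key to a partition (crc32 is a stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def append_to_topic(records, num_partitions):
    """Append (key, value) records to partitions, keeping arrival order."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


def assign_partitions(num_partitions, consumers):
    """Give each partition to exactly one consumer of the group."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


records = [
    ("user-1", "login"), ("user-2", "login"), ("user-1", "click"),
    ("user-2", "logout"), ("user-1", "logout"),
]
partitions = append_to_topic(records, 3)
assignment = assign_partitions(3, ["consumer-a", "consumer-b"])
print(partitions)
print(assignment)
```

All records with the same key land in the same partition in the order they were produced, and each partition has a single owner in the group, so per-key order is preserved while different partitions are processed in parallel.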



