Apache Kafka: All about Kafka topic with example
What is a Kafka topic:
“a particular stream of data”
In a Kafka cluster, you can have multiple topics. If you compare Kafka with a database, a topic is a kind of table that stores a set of data. For example, check the image for Kafka cluster topics. You can create any number of topics; Kafka sets no fixed limit, although a very large number of topics and partitions adds load on the cluster.
We can identify the topic with a 'name'. For example, batch_process_status is a topic.
Kafka stores each message as bytes, so a topic can carry any data format, such as CSV, JSON, or AVRO.
The sequence of messages in a topic is called a "data stream."
Topics are like tables, but you can’t query them. (If you want to read the data, you need to create the Kafka consumers and read the data).
Each topic is split into partitions. Sample partitions are below.
Messages within each partition are ordered, and each partition holds different messages with no dependency on the other partitions.
Each message in a partition gets an incremental ID called an “offset”.
Each partition has its own offsets (the offset plays a key role in data movement, because consumers use it to track their position).
Data in a Kafka topic is immutable once it is written to a partition. We can't change an existing message, but we can append a new message with updated data.
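The points above about partitions and offsets can be sketched with a toy model in plain Python. This is not Kafka code and not how a broker is implemented; the Partition class and its append and read methods are hypothetical names used only to illustrate an append-only log in which every partition numbers its own messages.

```python
class Partition:
    """Toy model of one partition: an append-only list of records."""

    def __init__(self):
        self._records = []

    def append(self, value):
        """Add a record at the end and return the offset it was given."""
        self._records.append(value)
        return len(self._records) - 1

    def read(self, offset):
        """Return (offset, value) pairs from the given offset onward, in order."""
        return list(enumerate(self._records))[offset:]


# Two partitions of the same topic keep independent offsets.
p0, p1 = Partition(), Partition()
first = p0.append("job-1 STARTED")
second = p0.append("job-1 FINISHED")
other = p1.append("job-2 STARTED")

# A reader that already processed offset 0 resumes from offset 1.
remaining = p0.read(1)
```

There is deliberately no update or delete method: a correction is written as a new record at the next offset, and the same offset number in p0 and p1 refers to unrelated messages.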
Kafka Topic Example :
Use case :
An application owner/stakeholder has 4 different ETL applications. At any point in time, they want to track the batch process status, application health, and user entitlements in a single window.
Each application will send a message to Kafka every 30 seconds. Each message will contain the batch process status at the job level.
You can have a topic 'ETL_batch_process_status' that tracks the status of all the applications.
We chose to create that topic with 10 partitions (arbitrary number).
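As a rough sketch of this use case, the snippet below builds a JSON status message and picks one of the 10 partitions from a message key. It uses only the Python standard library, not a Kafka client. The application names, the message fields, and the build_message and choose_partition helpers are assumptions made for illustration. The MD5 hash is a stand-in as well: Kafka's Java client hashes keys with murmur2, so the actual partition numbers would differ.

```python
import hashlib
import json

TOPIC = "ETL_batch_process_status"
NUM_PARTITIONS = 10  # the arbitrary number chosen above


def build_message(app_name, job_name, status):
    """Serialize one job-level status update as JSON (hypothetical fields)."""
    return json.dumps({"application": app_name, "job": job_name, "status": status})


def choose_partition(key, num_partitions=NUM_PARTITIONS):
    """Map a key to a partition: hash the key bytes, then take the remainder.

    MD5 is used here only as a stable stand-in hash for the illustration.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


# Hypothetical names for the four ETL applications.
apps = ["etl_app_1", "etl_app_2", "etl_app_3", "etl_app_4"]
assignments = {app: choose_partition(app) for app in apps}

message = build_message("etl_app_1", "daily_load", "RUNNING")
```

Using the application name as the key means every status update from one application lands in the same partition, so its updates stay in order relative to each other.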
Limitations of Kafka Topic
Once data is written to a partition, it can’t be changed (it is immutable).
Data is kept in the partition only for a limited time (the default is 7 days, and this is configurable).
An offset only has a meaning for a specific partition.
Data is assigned to a partition randomly unless a key is provided; messages with the same key go to the same partition.
We can create many partitions per topic.
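The retention limit above can be pictured with a small simulation. This is a simplified sketch in plain Python, not Kafka's mechanism: a real broker removes whole log segments rather than single records, and the window is set through configuration such as the topic-level retention.ms setting. The apply_retention function and the sample records are hypothetical.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=7)  # mirrors the 7-day default mentioned above


def apply_retention(records, now, retention=RETENTION):
    """Keep only (offset, timestamp, value) records inside the retention window."""
    cutoff = now - retention
    return [record for record in records if record[1] >= cutoff]


now = datetime(2024, 1, 10)
records = [
    (0, datetime(2024, 1, 1), "old status"),
    (1, datetime(2024, 1, 5), "recent status"),
    (2, datetime(2024, 1, 9), "latest status"),
]

# The record older than 7 days is dropped; the others keep their offsets.
kept = apply_retention(records, now)
```

The surviving records keep their original offsets, which matches Kafka's behavior: offsets are not renumbered after old data expires.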
Thank you!