Getting started with Apache Kafka

To start with, let us understand what Apache Kafka is.

Apache Kafka is a distributed streaming platform with three capabilities:

  1. Messaging system
  2. Store streams with fault tolerance
  3. Process the stream data

Let us take a moment to understand each of them.

Messaging System:

It is a message bus built for high-ingress data. It gives applications access to published data and, if needed, lets them replay it; that is, applications can process, persist and re-process streamed data.

We can divide a large project into small microservices and use Kafka to communicate between them.

Store streams with fault tolerance:

Since Kafka is a distributed system, each partition of a topic can be copied to several brokers; the number of copies is set by the replication factor. With a replication factor of 3 we can tolerate 2 broker failures without losing data. In general, the number of failures tolerated is replication_factor - 1.
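As a quick check of that arithmetic, here is a minimal shell sketch. The function name tolerated_failures is my own and is not part of Kafka:

```shell
# tolerated_failures REPLICATION_FACTOR
# Prints how many brokers can fail while at least one copy of the data survives.
tolerated_failures() {
  echo $(( $1 - 1 ))
}

for rf in 1 2 3; do
  echo "replication factor $rf -> tolerates $(tolerated_failures "$rf") broker failure(s)"
done
```

Note that a replication factor of 1 tolerates no failures at all, since there is only a single copy of each partition.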

Process the stream data:

Kafka provides a streams API for data processing. I will cover that in a future article.

Basic concepts of Kafka:

Topic: The unique name of a feed of records

Record: The smallest unit of data, made up of a key, a value and a timestamp

Partition: An ordered, immutable sequence of records; a topic is split into one or more partitions

Offset: A sequential ID assigned to each record within a partition

Broker: A node in the distributed system; the brokers together form the Kafka cluster

Broker ID: The unique identifier assigned to each broker

There are many more, such as leader, group_id and replication_factor, which I will cover in another article.
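To make record, partition and offset more concrete, here is a toy model in shell. It is only a sketch and not how Kafka stores data internally: I assume a partition can be pictured as an append-only text file, one record per line, where the offset is the zero-based line position. The names PARTITION_FILE, append_record and read_from are made up for this illustration.

```shell
# Toy model: a partition is an append-only file, one record per line.
# Each record has a timestamp, a key and a value, separated by tabs.
PARTITION_FILE=$(mktemp)

# append_record KEY VALUE
# Appends a record to the end of the partition and prints its offset.
append_record() {
  printf '%s\t%s\t%s\n' "$(date +%s)" "$1" "$2" >> "$PARTITION_FILE"
  echo $(( $(wc -l < "$PARTITION_FILE") - 1 ))
}

# read_from OFFSET
# Prints every record from OFFSET onwards, prefixed with its offset.
read_from() {
  awk -v start="$1" 'NR - 1 >= start { printf "%d\t%s\n", NR - 1, $0 }' "$PARTITION_FILE"
}

append_record user-1 "signed up" > /dev/null   # stored at offset 0
append_record user-2 "signed up" > /dev/null   # stored at offset 1
append_record user-1 "logged in" > /dev/null   # stored at offset 2

echo "A new reader starts at offset 0:"
read_from 0

echo "Replaying from offset 1:"
read_from 1
```

Existing records are never modified, only new ones are appended, which is why a reader can go back to an earlier offset and re-process the same data.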

Installation and configuration of a 3-node cluster. Since I am using a Mac, I will show the commands on macOS; they should not differ much on Linux.

  • wget http://apachemirror.wuchna.com/kafka/2.5.0/kafka_2.12-2.5.0.tgz (the URL will differ depending on the version you want to install)
  • tar -xvf kafka_2.12-2.5.0.tgz 
  • cd kafka_2.12-2.5.0
  • bin/zookeeper-server-start.sh config/zookeeper.properties

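Create two copies of the Kafka server configuration file config/server.properties, named config/server1.properties and config/server2.properties, so that there are three broker configurations in total. In each copy:

  1. Change broker.id (increment the integer by 1 for each broker)
  2. Change the port (increment it by 1 for each broker; in server.properties the port is part of the listeners setting, for example listeners=PLAINTEXT://:9093)
  3. Change log.dirs as well, so that each broker writes to its own directory (brokers on the same machine cannot share one)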
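If you prefer not to edit the copies by hand, the three changes above can be sketched with sed. This is an illustration under some assumptions: make_broker_config is a helper name I made up, the base file is assumed to contain broker.id, a (possibly commented-out) listeners line and log.dirs as the stock Kafka 2.5 file does, and ports 9093 and 9094 and the /tmp/kafka-logs-N directories are arbitrary choices.

```shell
# make_broker_config BASE_FILE OUT_FILE BROKER_ID PORT
# Copies BASE_FILE to OUT_FILE, replacing broker.id, the listener port
# and the log directory so the new broker does not clash with the others.
make_broker_config() {
  sed \
    -e "s|^broker\.id=.*|broker.id=$3|" \
    -e "s|^#\{0,1\}listeners=.*|listeners=PLAINTEXT://:$4|" \
    -e "s|^log\.dirs=.*|log.dirs=/tmp/kafka-logs-$3|" \
    "$1" > "$2"
}

# Only runs when started from the Kafka directory; does nothing elsewhere.
if [ -f config/server.properties ]; then
  make_broker_config config/server.properties config/server1.properties 1 9093
  make_broker_config config/server.properties config/server2.properties 2 9094
fi
```

It is worth opening the generated files once to confirm that all three values were changed before starting the brokers.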

Then start all three servers as follows, each in its own terminal, since every command keeps running in the foreground:

  • bin/kafka-server-start.sh config/server.properties
  • bin/kafka-server-start.sh config/server1.properties
  • bin/kafka-server-start.sh config/server2.properties

Cool, we have the servers up and running. I recommend looking through all the scripts in the bin folder; they are the main tools for managing your Kafka cluster.

Let's create a topic:

We can use the tool bin/kafka-topics.sh to create, list and describe topics, among other things.

Command: bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic fault_tolerated_topic --partitions 3 --replication-factor 3

--create is the option to create a topic

--topic <name of the feed> gives the topic its name

--partitions sets how many partitions the topic's data is split into

--replication-factor sets how many copies of each partition are kept, which defines the level of fault tolerance

Awesome, we just created our first topic. Let's see its details with the help of the --describe option:

bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic fault_tolerated_topic

Topic: fault_tolerated_topic PartitionCount: 3 ReplicationFactor: 3 Configs: 

Topic: fault_tolerated_topic Partition: 0 Leader: 2 Replicas: 2,0,1 Isr: 2,0,1

Topic: fault_tolerated_topic Partition: 1 Leader: 0 Replicas: 0,1,2 Isr: 0,1,2

Topic: fault_tolerated_topic Partition: 2 Leader: 1 Replicas: 1,2,0 Isr: 1,2,0
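A short note on reading this output: Leader is the ID of the broker that serves the partition, Replicas lists the brokers holding a copy of it, and Isr (in-sync replicas) lists the copies that are currently caught up with the leader. As a rough sketch, the comparison between Replicas and Isr can be automated with awk. check_isr is a hypothetical helper of my own, not a Kafka tool, and it assumes the output layout shown above.

```shell
# check_isr
# Reads the output of "kafka-topics.sh --describe" on standard input and
# reports, per partition, how many of its replicas are in sync.
check_isr() {
  awk '{
    part = ""; replicas = ""; isr = ""
    for (i = 1; i < NF; i++) {
      if ($i == "Partition:") part = $(i + 1)
      if ($i == "Replicas:")  replicas = $(i + 1)
      if ($i == "Isr:")       isr = $(i + 1)
    }
    if (part == "") next    # skip the summary line, which has no Partition: field
    n_rep = split(replicas, r, ",")
    n_isr = split(isr, s, ",")
    status = (n_rep == n_isr) ? "fully replicated" : "under-replicated"
    printf "partition %s: %d/%d replicas in sync (%s)\n", part, n_isr, n_rep, status
  }'
}

# The describe output from above, fed in as sample input.
check_isr <<'EOF'
Topic: fault_tolerated_topic PartitionCount: 3 ReplicationFactor: 3 Configs:
Topic: fault_tolerated_topic Partition: 0 Leader: 2 Replicas: 2,0,1 Isr: 2,0,1
Topic: fault_tolerated_topic Partition: 1 Leader: 0 Replicas: 0,1,2 Isr: 0,1,2
Topic: fault_tolerated_topic Partition: 2 Leader: 1 Replicas: 1,2,0 Isr: 1,2,0
EOF
```

Against a running cluster, the same helper could be fed directly: bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic fault_tolerated_topic | check_isr. If you stop one of the three brokers and run it again, the stopped broker should drop out of the Isr lists.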

All right, that is a lot to grasp. I will continue from here in my next article on Kafka; for the time being, just keep playing with your Kafka setup :)



