Extracting data from Kafka and loading into DB
Learn and Grow


Apache Kafka is a distributed event store and stream-processing platform. It can publish and consume messages in structured formats such as JSON and XML.

[Image: Job design for extracting data from Kafka and loading into DB]

tKafkaInput: This component consumes data from a Kafka topic in string format. Since all the data lives in a Kafka topic, we need to specify the topic name, the consumer group ID, and the broker list. There is also an option called "New consumer group starts from" with two values: Beginning and Latest. Because Kafka is a continuous stream, Beginning delivers all the data already stored in the topic, while Latest delivers only the messages produced after the job starts running.

[Image: tKafkaInput configuration]
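The Beginning/Latest choice corresponds to Kafka's offset-reset behaviour (in client libraries this is typically `earliest` vs `latest`). A minimal pure-Python sketch of the semantics, with the topic simulated as a list of (offset, message) pairs (all names and payloads here are illustrative, not from the actual job):

```python
# Simulate a Kafka topic as an append-only log of (offset, message) pairs.
# "Beginning" replays everything already in the topic; "Latest" yields only
# messages produced after the consumer (the Talend job) starts.

def consume(topic_log, start_option, job_start_offset):
    """Return the messages a brand-new consumer group would see.

    topic_log        -- list of (offset, message) pairs in the topic
    start_option     -- "Beginning" or "Latest" (tKafkaInput's
                        'New consumer group starts from' option)
    job_start_offset -- offset of the last message present when the job starts
    """
    if start_option == "Beginning":
        return [msg for _, msg in topic_log]  # full replay of the topic
    elif start_option == "Latest":
        return [msg for off, msg in topic_log if off > job_start_offset]
    raise ValueError(f"unknown option: {start_option}")

# Five messages already in the topic when the job starts (offsets 0-4),
# then two more arrive while the job is running (offsets 5-6).
log = [(i, f"payload-{i}") for i in range(7)]

print(consume(log, "Beginning", job_start_offset=4))  # all seven messages
print(consume(log, "Latest", job_start_offset=4))     # payload-5, payload-6
```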

tFlowToIterate: This component iterates over the incoming data one row at a time, taking each message from Kafka and passing it on to the next step. Since tFlowToIterate cannot be connected directly to tWriteJSONFields, a tJava component is placed in between.

tWriteJSONFields then extracts all the fields from the JSON payload received from Kafka.

[Image: tWriteJSONFields configuration]
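In plain Python, the iterate-then-extract step amounts to parsing each message and pulling out named fields. A small sketch, assuming a hypothetical payload shape (the real field names would come from the job's schema):

```python
import json

# Hypothetical payloads as they might arrive from the Kafka topic.
messages = [
    '{"id": 1, "name": "alpha", "created": "2024-01-15"}',
    '{"id": 2, "name": "beta",  "created": "2024-02-20"}',
]

rows = []
for payload in messages:          # tFlowToIterate: one message at a time
    record = json.loads(payload)  # extraction step: parse the JSON string
    rows.append((record["id"], record["name"], record["created"]))

print(rows)
```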

After these fields are extracted from the JSON, some transformations are needed; tMap is used for this. In this requirement, the date format has to be changed.

[Image: tMap configuration]
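The date-format change done in tMap can be sketched in plain Python with `datetime` (the source and target patterns here are assumptions for illustration; the real formats depend on the job's schema):

```python
from datetime import datetime

def reformat_date(value, src="%Y-%m-%d", dst="%d-%b-%Y"):
    """Parse `value` with the source pattern and re-emit it with the target one."""
    return datetime.strptime(value, src).strftime(dst)

print(reformat_date("2024-01-15"))  # 15-Jan-2024
```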

row2 is the input side and Transform_Payload is the output side. Clicking the icon at the top left opens the tMap settings, where there is a "Die on error" option, as shown in the image below. Disabling it automatically adds another output flow for capturing error rows. We'll cover error handling, along with the mapping transformations, in the next article.

[Image: Die on error option]
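The effect of disabling "Die on error" is that a row which fails transformation goes to a reject output instead of stopping the whole job. A sketch of that split in plain Python, reusing the date-format transformation (field names are illustrative):

```python
from datetime import datetime

def transform(row):
    # The date-format change done in tMap; raises ValueError on bad input.
    return {**row,
            "created": datetime.strptime(row["created"], "%Y-%m-%d")
                               .strftime("%d-%b-%Y")}

main_out, rejects = [], []
for row in [{"id": 1, "created": "2024-01-15"},
            {"id": 2, "created": "not-a-date"}]:
    try:
        main_out.append(transform(row))        # normal output flow
    except ValueError as err:                  # error-capture flow
        rejects.append({**row, "errorMessage": str(err)})

print(len(main_out), len(rejects))  # 1 1
```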

Once all the transformations are done, the data has to be loaded into the database using the tDBOutput component; here an Oracle DB is used. "Action on table" controls table-level operations such as creating or dropping the table. It is set to Default here, so the data is simply inserted into an already created Oracle table. "Action on data" is the row-level operation: inserting, deleting, or updating the data. Here it is Insert. We'll look at these options in detail in the next article.

[Image: tDBOutput configuration]
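The load step amounts to a parameterized INSERT into a pre-existing table. As a self-contained sketch, an in-memory SQLite database stands in for the Oracle target (table and column names are made up), mirroring "Action on table = Default" (the table already exists) and "Action on data = Insert":

```python
import sqlite3

# Stand-in for the Oracle target: the table is created up front, so the
# job itself only inserts rows into it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kafka_events (id INTEGER, name TEXT, created TEXT)")

rows = [(1, "alpha", "15-Jan-2024"), (2, "beta", "20-Feb-2024")]
conn.executemany(
    "INSERT INTO kafka_events (id, name, created) VALUES (?, ?, ?)", rows
)
conn.commit()

print(conn.execute("SELECT COUNT(*) FROM kafka_events").fetchone()[0])  # 2
```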

The error flow is simply captured with a tJavaRow or tLogRow component.

Thank you for reading!
