Extracting data from Kafka and loading it into a DB
Kafka is a distributed event store and stream-processing platform. Its messages are arbitrary byte payloads, so it can publish and consume structures serialized as JSON or XML strings.
tKafkaInput: This component consumes data from a Kafka topic in string format. Since all the data is stored in a Kafka topic, we need to mention the Kafka topic, the consumer group id, and the broker list. There is also an option called 'New consumer group starts from', which has two choices: latest and beginning. Kafka is a continuous streaming process, so the beginning option returns all the data already in the topic, while the latest option returns only the data that arrives after the job starts running.
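To make the configuration concrete, here is a minimal sketch of what an equivalent plain Kafka consumer looks like in Java. The broker address localhost:9092, group id talend-job, and topic orders are placeholder assumptions, not values from the original job; auto.offset.reset set to earliest roughly corresponds to the 'beginning' option, and latest to 'latest'.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class OrdersConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // broker list (placeholder)
        props.put("group.id", "talend-job");              // consumer group id (placeholder)
        // "earliest" reads from the start of the topic when the group has no
        // committed offset; "latest" only picks up newly produced messages.
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders")); // topic (placeholder)
            while (true) {
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.println(record.value()); // each message as a string
                }
            }
        }
    }
}
```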
tFlowToIterate: This component iterates over the data one row at a time. It helps to pick up each message from Kafka individually and send it to the next step. We cannot connect tFlowToIterate directly to tExtractJSONFields, so we place a tJava component in between.
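An iterate link from tFlowToIterate publishes each row's columns into Talend's globalMap under keys of the form rowName.columnName, and the tJava in between can read them back. The sketch below assumes the incoming row is named row1 with a single column named line; adjust both to match your own schema.

```java
// Body of the tJava component: fetch the value that tFlowToIterate
// stored in globalMap for the current iteration.
// "row1" and "line" are assumed names, not taken from the original job.
String payload = (String) globalMap.get("row1.line");
System.out.println("Current Kafka message: " + payload);
```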
Then tExtractJSONFields extracts all the fields from the JSON payload received from Kafka.
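Conceptually, the extraction does something like the following Jackson-based sketch. The payload shape and the field names orderId and orderDate are hypothetical examples, not fields from the original message.

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class ExtractFields {
    public static void main(String[] args) throws Exception {
        // Hypothetical payload; the real message schema comes from the topic.
        String payload = "{\"orderId\": 42, \"orderDate\": \"25/12/2023\"}";
        JsonNode root = new ObjectMapper().readTree(payload);
        int orderId = root.get("orderId").asInt();
        String orderDate = root.get("orderDate").asText();
        System.out.println(orderId + " | " + orderDate);
    }
}
```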
After these fields are extracted from the JSON, some transformations are applied, and tMap is used for them. In this requirement we need to change the date format, as sketched below.
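In tMap, a date-format change is typically a one-line output expression built from Talend's built-in TalendDate routines. Here is a sketch; the input column row2.orderDate and both date patterns are assumptions for illustration.

```java
// tMap output expression: parse the incoming dd/MM/yyyy string and
// re-format it as yyyy-MM-dd. Column name and patterns are assumed.
TalendDate.formatDate("yyyy-MM-dd",
        TalendDate.parseDate("dd/MM/yyyy", row2.orderDate))
```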
row2 is the input side and Transform_Payload is the output side. When we click at the top left of the tMap editor, there is a 'Die on error' option, as shown in the image below. Disabling it automatically adds another output flow for capturing errors. We will cover error handling, along with the mapping transformation, in the next article.
Once all the transformations are done, the data has to be loaded into the DB with the tDBOutput component; here we have used an Oracle DB. 'Action on table' covers table-level operations such as creating or dropping the table; here it is set to Default, so the data is simply inserted into an already created Oracle table. 'Action on data' covers inserting, updating, or deleting rows; here it is the Insert operation. We will see this in detail in the next article.
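Under the hood, an Insert in tDBOutput boils down to a JDBC prepared statement like the sketch below. The connection URL, credentials, table ORDERS, and its columns are placeholders rather than the original job's values, and the Oracle JDBC driver (ojdbc) must be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class OracleInsert {
    public static void main(String[] args) throws Exception {
        // Placeholder URL, user, and password; replace with your own.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@localhost:1521/ORCLPDB1", "app", "secret")) {
            String sql = "INSERT INTO ORDERS (ORDER_ID, ORDER_DATE) VALUES (?, ?)";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setInt(1, 42);              // value extracted from the payload
                ps.setString(2, "2023-12-25"); // date already re-formatted by tMap
                ps.executeUpdate();
            }
        }
    }
}
```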
The error part is simply captured in a tJavaRow or tLogRow component.
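For example, a tJavaRow attached to the error flow can log the reject details. input_row and output_row are the standard tJavaRow variables; the errorCode and errorMessage columns are the usual tMap reject columns, assumed here.

```java
// Inside tJavaRow on the error flow: log the reject and pass the
// row through unchanged. Column names are assumed examples.
System.err.println("Rejected record: code=" + input_row.errorCode
        + ", message=" + input_row.errorMessage);
output_row.errorCode = input_row.errorCode;
output_row.errorMessage = input_row.errorMessage;
```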
Thank you for reading!