Implementing a Talend Job for Incremental Data Processing
Recently I was stuck on one piece of logic for hours: we have to split the rows into batches based on a count that we receive as an input. For example, if I have 10 rows and the count is 3, the first 3 rows have to flow first, then the next 3, then 3, then the remaining 1.
After splitting, each batch has to go directly to the next flow; we shouldn't create a temporary file for the split. This would be an easy task with tFileOutputDelimited, because it can split the output into several files, but intermediate files are not allowed here.
Now I'll show you another way to implement this without creating intermediate files.
tLoop - This component iterates the flow until the specified condition is met. Here I have set the starting value i = 1, and the loop runs once per batch; in general the number of iterations should be ceil(rowCount / batchSize), so for 10 rows and a batch size of 3 the loop runs 4 times. The next component is tFileInputDelimited, which picks up the CSV file from the specified folder.
After reading the data from the CSV, we implement the batching logic in a tJavaFlex component. counter starts at 0 and batchSize is 3; the batch size is the maximum number of rows that should flow in each iteration. tJavaFlex lets us run code around the row flow: the main code executes once per row, so that is where the condition goes. counter is incremented for every row that flows through, and a setNumber variable marks which batch each row belongs to. setNumber is assigned incrementally: 1 is assigned to the first 3 rows of the input file, 2 to the next 3 rows, and so on. Finally, the setNumber value is stored in globalMap so downstream components can read it.
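The tJavaFlex logic described above can be sketched in plain Java. The variable names (counter, batchSize, setNumber) follow the description, but the surrounding class and the computeSetNumbers helper are only a stand-alone simulation of the per-row main code; in the real job, Talend generates the wrapper and the value goes into globalMap:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchNumberDemo {
    /** Assigns a batch number (setNumber) to each input row,
     *  mirroring the tJavaFlex main-code logic. */
    public static List<Integer> computeSetNumbers(int rowCount, int batchSize) {
        // --- tJavaFlex "Start code": runs once, before the first row ---
        int counter = 0;
        List<Integer> setNumbers = new ArrayList<>();
        for (int row = 0; row < rowCount; row++) {
            // --- tJavaFlex "Main code": runs once per row ---
            counter++;
            // Rows 1..3 get setNumber 1, rows 4..6 get 2, and so on.
            // In the real job: globalMap.put("setNumber", setNumber);
            int setNumber = ((counter - 1) / batchSize) + 1;
            setNumbers.add(setNumber);
        }
        return setNumbers;
    }

    public static void main(String[] args) {
        // 10 rows with batch size 3 -> batches of 3, 3, 3 and 1
        System.out.println(computeSetNumbers(10, 3)); // [1, 1, 1, 2, 2, 2, 3, 3, 3, 4]
    }
}
```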
The next component is tFilterRow, which filters rows based on conditions. We have already assigned a setNumber to each row, so now we filter the rows on that value; here i comes from the tLoop component. We need to tick the "Use advanced mode" check box to write the filter condition:
((Integer)globalMap.get("setNumber")) == i
Now the job can produce the data as per the requirement. Use tLogRow to print each batch.
Comments:
- "This is not an incremental load logic, right?"
- "Good solution. However, in the above job you will end up reading the same file multiple times."
- "Well said, Sneha A."