Hi Nishu,

the data loss might be caused by the fact that processing-time triggers do not fire when the program terminates. If your program has records stored in a window and the program terminates because the input was fully consumed, the window operator won't process the remaining windows but will simply be canceled.
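To make the failure mode concrete, here is a minimal plain-Java sketch (not the actual Flink API; `onElement`, `onTimer`, and `WindowDropSketch` are illustrative names) of how records buffered in a window are lost when the job ends before the processing-time trigger fires:

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a window buffers elements and emits them when its
// processing-time timer fires. If the job terminates first, the buffered
// records are never emitted.
public class WindowDropSketch {
    static final List<String> buffer = new ArrayList<>();
    static final List<String> emitted = new ArrayList<>();

    // Called for each incoming record: buffer it in the window state.
    static void onElement(String record) {
        buffer.add(record);
    }

    // Called when the processing-time timer fires: flush the window.
    static void onTimer() {
        emitted.addAll(buffer);
        buffer.clear();
    }

    public static void main(String[] args) {
        onElement("a");
        onElement("b");
        // Input is fully consumed and the job shuts down before the timer
        // fires, so onTimer() is never called and "a"/"b" are dropped.
        System.out.println("emitted=" + emitted + " lostInBuffer=" + buffer);
    }
}
```

Running `main` shows an empty `emitted` list while the records still sit in `buffer`, which is exactly the state a canceled window operator discards.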
Best,
Fabian

2017-12-07 23:13 GMT+01:00 Nishu <nishuta...@gmail.com>:

> Hi,
>
> Thanks for your inputs.
> I am reading Kafka topics into global windows and have defined some
> processing-time triggers, so there are no late records.
>
> The program performs a join between multiple Kafka topics. The
> transformation sequence is roughly:
> 1. Read a Kafka topic
> 2. Apply a window and trigger to the topic
> 3. Parse the data into POJO objects
> 4. Group the POJO objects by their keys
> 5. Read the other topics and perform the same steps
> 6. Join the grouped output with the other topics' grouped records
>
> I get all records up to step 3 as expected, but in step 4 a few keys
> are dropped, with inconsistent behavior on each run.
> I have tried the pipeline with different setups, i.e. 1 task slot with
> 1 parallel thread, and multiple task slots with multiple threads.
>
> It looks like the Beam Flink runner has a bug in the pipeline
> translation for the streaming scenario.
>
> Thanks,
> Nishu
>
>
> On Thu, Dec 7, 2017 at 7:13 PM, Chen Qin <qinnc...@gmail.com> wrote:
>
>> Nishu,
>>
>> You might consider a side output with metrics, at least after the
>> window. I would suggest having that to catch data skew or partition
>> skew in all Flink jobs, and amend if needed.
>>
>> Chen
>>
>> On Thu, Dec 7, 2017 at 9:48 AM Fabian Hueske <fhue...@gmail.com> wrote:
>>
>>> Is it possible that the data is dropped due to being late, i.e.,
>>> records with timestamps behind the current watermark?
>>> What kind of operations does your program consist of?
>>>
>>> Best,
>>> Fabian
>>>
>>> 2017-12-07 10:20 GMT+01:00 Sendoh <unicorn.bana...@gmail.com>:
>>>
>>>> I would recommend also printing the count of input and output of
>>>> each operator by using an Accumulator.
>>>>
>>>> Cheers,
>>>>
>>>> Sendoh
>>>>
>>>> --
>>>> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>>>
>> --
>> Chen
>> Software Eng, Facebook
>
> --
> Thanks & Regards,
> Nishu Tayal
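Sendoh's suggestion of counting input and output per operator can be sketched in plain Java (this is not the Flink Accumulator API; `CountingStage` and its fields are hypothetical names): wrap each stage with in/out counters so a stage that drops keys becomes visible by a count mismatch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Predicate;

// A stage that counts every record it receives and every record it emits.
// Comparing "in" and "out" per stage pinpoints where records are dropped.
public class CountingStage {
    final String name;
    final AtomicLong in = new AtomicLong();
    final AtomicLong out = new AtomicLong();
    final Predicate<String> keep;

    CountingStage(String name, Predicate<String> keep) {
        this.name = name;
        this.keep = keep;
    }

    // Process a batch, incrementing the input counter for each record
    // and the output counter for each record actually passed through.
    List<String> process(List<String> records) {
        List<String> result = new ArrayList<>();
        for (String r : records) {
            in.incrementAndGet();
            if (keep.test(r)) {
                result.add(r);
                out.incrementAndGet();
            }
        }
        return result;
    }

    public static void main(String[] args) {
        CountingStage group = new CountingStage("groupByKey", r -> !r.isEmpty());
        group.process(List.of("k1", "", "k2"));
        // If in != out here, this stage is where records disappear.
        System.out.println(group.name + ": in=" + group.in + " out=" + group.out);
    }
}
```

In an actual Flink or Beam job, the same idea is applied with the framework's own accumulators or metrics, logged or exposed per operator after each window fires.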