Hello Flink Community,

Our Flink application runs on v1.9. The basic logic is to consume one large Kafka topic, filter on some fields, and produce the results to a new Kafka topic. After comparing counts between the original topic and the generated topic on the same field, using a Presto query, we found a slight data loss (around 1.37220156e-7 of records per hour). The original Kafka topic collects data from mobile devices, so it can contain late-arriving events; that is why we use processing time, since ordering does not matter for us.
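To put that rate in perspective, here is a quick back-of-envelope conversion of the loss fraction into absolute records; the hourly volume used below is a hypothetical figure for illustration, not our actual traffic:

```python
# Convert the observed loss fraction into absolute records lost per hour.
loss_fraction = 1.37220156e-7        # measured via the Presto count comparison
records_per_hour = 1_000_000_000     # hypothetical volume, for illustration only

lost_per_hour = loss_fraction * records_per_hour
print(f"~{lost_per_hour:.0f} records lost per hour at 1B events/hour")
# prints "~137 records lost per hour at 1B events/hour"
```

So even at a billion events per hour, the discrepancy is on the order of a hundred records, which is small but consistent enough that we would like to understand the cause.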
Since this job uses processing time, any idea what could potentially cause this data loss? Also, when Flink runs on processing time, what is the default time window, and could the default window cause the loss? Any suggestions would be appreciated.

Thanks and best regards,
Rainie