Hello Flink Community,

Our Flink application runs on v1.9. The basic logic is to consume one large Kafka topic, filter on some fields, and produce the results to a new Kafka topic. After comparing counts between the original topic and the generated topic on the same field, using a Presto query, we found a slight data loss (around 1.37220156e-7 of records per hour). The original Kafka topic collects data from mobile devices, so it can contain late-arriving events; that is why we use processing time, since ordering does not matter for us.
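To put that rate in perspective, here is a quick back-of-envelope conversion of the loss fraction into absolute records; the hourly volume used below is a hypothetical figure for illustration, not our actual traffic:

```python
# Convert the observed loss fraction into absolute records lost per hour.
loss_fraction = 1.37220156e-7        # measured via the Presto count comparison
records_per_hour = 1_000_000_000     # hypothetical volume, for illustration only

lost_per_hour = loss_fraction * records_per_hour
print(f"~{lost_per_hour:.0f} records lost per hour at 1B events/hour")
# prints "~137 records lost per hour at 1B events/hour"
```

So even at a billion events per hour, the discrepancy is on the order of a hundred records, which is small but consistent enough that we would like to understand the cause.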
Since this job uses processing time, any idea what could potentially cause this data loss? Also, when Flink runs on processing time, what is the default time window, and could the default window cause the loss? Any suggestions would be appreciated.

Thanks and best regards,
Rainie