My Flink job does aggregations on top of event-time windows across Kafka partitions. As I have been developing and restarting it, the state for the catch-up periods becomes unreliable: I see lots of duplicate emits for time windows that were already handled before, which I have to discard since my sink can't handle them. There may be a bug in my job, but I wanted to clarify whether this might be a flaw in Flink's handling of this.
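For concreteness, the pipeline is shaped roughly like the sketch below. This is not my actual job: the topic name, record format ("key,eventTimestampMillis" strings), window size and out-of-orderness bound are placeholders, and I'm using the WatermarkStrategy-style API purely for illustration. The point is that watermarks are assigned on the stream, i.e. after the single source subtask has merged all partitions.

import java.time.Duration;
import java.util.Properties;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class EventTimeWindowJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1); // one source subtask reads all n partitions

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "event-time-demo");

        // records are assumed to be "key,eventTimestampMillis" strings
        FlinkKafkaConsumer<String> consumer =
                new FlinkKafkaConsumer<>("events", new SimpleStringSchema(), props);

        env.addSource(consumer)
                // watermarks are generated here, AFTER the partitions have been
                // merged into a single stream by the one source subtask
                .assignTimestampsAndWatermarks(
                        WatermarkStrategy
                                .<String>forBoundedOutOfOrderness(Duration.ofSeconds(10))
                                .withTimestampAssigner(
                                        (record, ts) -> Long.parseLong(record.split(",")[1])))
                .keyBy(record -> record.split(",")[0])
                .window(TumblingEventTimeWindows.of(Time.minutes(1)))
                .reduce((a, b) -> a) // placeholder aggregation
                .print();

        env.execute("event-time windowed aggregation");
    }
}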
I understand there is an m:n mapping of Kafka partitions to source instances, depending on the parallelism, and that each source instance has its own watermark. During catch-up, watermark progression can become pretty fragile, e.g. in my case, where there are n partitions and the parallelism is 1. I feel like some kind of event-time alignment is needed across the partitions. I may be completely off here, so I look forward to your thoughts on this!
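To make the last point concrete: what I mean by alignment is per-partition watermarking inside the source, so the source only advances event time as far as the slowest partition allows. If I read the Kafka connector right, attaching the watermark assigner to the consumer itself (rather than to the merged stream, as in the sketch above) should give this behaviour; env and props are the same placeholders as before, and I may well be misreading the connector.

// env and props as in the sketch above
FlinkKafkaConsumer<String> alignedConsumer =
        new FlinkKafkaConsumer<>("events", new SimpleStringSchema(), props);

// Attaching the watermark strategy to the consumer itself (rather than to the
// merged DataStream) should make the connector track a watermark per Kafka
// partition and emit only the minimum across the partitions it reads.
alignedConsumer.assignTimestampsAndWatermarks(
        WatermarkStrategy
                .<String>forBoundedOutOfOrderness(Duration.ofSeconds(10))
                .withTimestampAssigner(
                        (record, ts) -> Long.parseLong(record.split(",")[1])));

DataStream<String> events = env.addSource(alignedConsumer);
// ... same keyBy / window / aggregation as above, just without a second
// assignTimestampsAndWatermarks on the merged stream

Is something along these lines what is intended, or is there another recommended way to keep event time aligned across partitions during catch-up?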