Hi!

The FlinkKafkaConsumer can handle watermark advancement with
per-Kafka-partition awareness (across partitions of different topics).
You can see an example of how to do that here [1].

Basically what this does is that it generates watermarks within the Kafka
consumer individually for each Kafka partition, and the per-partition
watermarks are aggregated and emitted from the consumer in the same way that
watermarks are aggregated on a stream shuffle; only when the low watermark
advances across all partitions, should a watermark be emitted from the
consumer.

Therefore, this helps avoid the problem that you described, in which a
"big_topic" has subscribed partitions that lags behind others. In this case
and when the above feature is used, the event time would advance along with
the lagging "big_topic" partitions and would not result in messages being
recognized as late and discarded.

Cheers,
Gordon

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/event_timestamps_watermarks.html#timestamps-per-kafka-partition



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Reply via email to