Hello everyone,
First, a brief pipeline introduction:
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
consume multiple Kafka topics
-> union them
-> assignTimestampsAndWatermarks
-> keyBy
-> window() and so on ...
This is a fairly standard way to use Flink to process data in a production
environment.
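
For reference, here is a minimal sketch of what I mean. The topic names
("slow-topic" / "fast-topic"), the Event POJO, and the toy deserialization
schema are placeholders; the real job is more involved:

import java.util.Properties;

import org.apache.flink.api.common.serialization.AbstractDeserializationSchema;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class PipelineSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "my-test-group");

        // One consumer per topic; ~3 records/s on the slow topic,
        // ~30,000 records/s on the fast one.
        FlinkKafkaConsumer<Event> slowSource =
                new FlinkKafkaConsumer<>("slow-topic", new EventDeserializationSchema(), props);
        FlinkKafkaConsumer<Event> fastSource =
                new FlinkKafkaConsumer<>("fast-topic", new EventDeserializationSchema(), props);

        // union -> assignTimestampsAndWatermarks -> keyBy -> window
        DataStream<Event> unioned = env.addSource(slowSource)
                .union(env.addSource(fastSource));

        unioned
                .assignTimestampsAndWatermarks(
                        new BoundedOutOfOrdernessTimestampExtractor<Event>(Time.seconds(5)) {
                            @Override
                            public long extractTimestamp(Event e) {
                                return e.timestamp;
                            }
                        })
                .keyBy(e -> e.key)
                .window(TumblingEventTimeWindows.of(Time.minutes(1)))
                .reduce((a, b) -> b)   // placeholder window function
                .print();

        env.execute("pipeline-sketch");
    }

    // Hypothetical record type; the real one has more fields.
    public static class Event {
        public String key;
        public long timestamp;
    }

    // Toy schema assuming a "key,timestampMillis" wire format.
    public static class EventDeserializationSchema extends AbstractDeserializationSchema<Event> {
        @Override
        public Event deserialize(byte[] message) {
            String[] parts = new String(message).split(",");
            Event e = new Event();
            e.key = parts[0];
            e.timestamp = Long.parseLong(parts[1]);
            return e;
        }
    }
}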
But if I want to test the pipeline above, I need to use
FlinkKafkaConsumer.setStartFromTimestamp(timestamp) to consume 'past' data.
So my question is: how can I control the replay 'pace' so the two sources
stay in step, given that one topic produces 3 records per second while the
other produces 30,000 per second?
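
Concretely, continuing the sketch above, the test setup I have in mind looks
like this (the start timestamp is just an example value):

// Replay 'past' data for a test run: both consumers start reading
// from the same point in time, e.g. two hours ago.
long replayStart = System.currentTimeMillis() - 2 * 60 * 60 * 1000L;
slowSource.setStartFromTimestamp(replayStart);
fastSource.setStartFromTimestamp(replayStart);

During replay each consumer reads as fast as it can, so the low-volume topic
races ahead in event time while the high-volume one lags far behind; that is
the imbalance I would like to control.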
I'm not sure whether I've described this clearly, so if you have any
questions or suggestions, please let me know.
Thanks