Hi, Can not you write the watermark as a special event to the "mid-topic"? In the "new job2" you would parse this event and use it to assign watermark before `xxxWindow2`? I believe this is what FlinkKafkaShuffle is doing [1], you could look at its code for inspiration.
Piotrek [1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/api/java/org/apache/flink/streaming/connectors/kafka/shuffle/FlinkKafkaShuffle.html pon., 1 mar 2021 o 13:01 yidan zhao <hinobl...@gmail.com> napisaĆ(a): > I have a job which includes about 50+ tasks. I want to split it to > multiple jobs, and the data is transferred through Kafka, but how about > watermark? > > Is anyone have do something similar and solved this problem? > > Here I give an example: > The original job: kafkaStream1(src-topic) => xxxProcess => xxxWindow1 ==> > xxxWindow2 resultSinkToKafka(result-topic). > > The new job1: kafkaStream1(src-topic) => xxxProcess => xxxWindow1 ==> > resultSinkToKafka(mid-topic). > The new job2: kafkaStream1(mid-topic) => xxxWindow2 ==> > resultSinkToKafka(result-topic). > > The watermark for window1 and window 2 is separated to two jobs, which > also seems to be working, but this introduces a 5-minute delay for window2 > (both window is 5min's cycle). > > The key problem is that the window's cycle is 5min, so the window2 will > have a 5min's delay. > If watermark can be transferred between jobs, it is not a problem anymore. > >