Hi,

Can not you write the watermark as a special event to the "mid-topic"? In
the "new job2" you would parse this event and use it to assign watermark
before `xxxWindow2`? I believe this is what FlinkKafkaShuffle is doing [1],
you could look at its code for inspiration.

Piotrek

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.12/api/java/org/apache/flink/streaming/connectors/kafka/shuffle/FlinkKafkaShuffle.html

pon., 1 mar 2021 o 13:01 yidan zhao <hinobl...@gmail.com> napisaƂ(a):

> I have a job which includes about 50+ tasks. I want to split it to
> multiple jobs, and the data is transferred through Kafka, but how about
> watermark?
>
> Is anyone have do something similar and solved this problem?
>
> Here I give an example:
> The original job: kafkaStream1(src-topic) => xxxProcess => xxxWindow1 ==>
> xxxWindow2 resultSinkToKafka(result-topic).
>
> The new job1: kafkaStream1(src-topic) => xxxProcess => xxxWindow1 ==>
> resultSinkToKafka(mid-topic).
> The new job2: kafkaStream1(mid-topic) => xxxWindow2 ==>
> resultSinkToKafka(result-topic).
>
> The watermark for window1 and window 2 is separated to two jobs, which
> also seems to be working, but this introduces a 5-minute delay for window2
> (both window is 5min's cycle).
>
> The key problem is that the window's cycle is 5min, so the window2 will
> have a 5min's delay.
> If watermark can be transferred between jobs, it is not a problem anymore.
>
>

Reply via email to