Actually, I just put this process function there for debugging purposes. My
main goal is to join the E & C using the Temporal Table function, but I
have observed exactly the same behavior i.e. when the parallelism was > 1
there was no output and when I was setting it to 1 then the output was
generated. So, I have switched to process function to see whether the
watermarks are reaching this stage.

Best Regards,
Dom.

pon., 16 mar 2020 o 19:46 Theo Diefenthal <theo.diefent...@scoop-software.de>
napisał(a):

> Hi Dominik,
>
> I had the same once with a custom processfunction. My processfunction
> buffered the data for a while and then output it again. As the proces
> function can do anything with the data (transforming, buffering,
> aggregating...), I think it's just not safe for flink to reason about the
> watermark of the output.
>
> I solved all my issues by calling `assignTimestampsAndWatermarks` directly
> post to the (co-)process function.
>
> Best regards
> Theo
>
> ------------------------------
> *Von: *"Dominik Wosiński" <wos...@gmail.com>
> *An: *"user" <user@flink.apache.org>
> *Gesendet: *Montag, 16. März 2020 16:55:18
> *Betreff: *Issues with Watermark generation after join
>
> Hey,
> I have noticed a weird behavior with a job that I am currently working on.
> I have 4 different streams from Kafka, lets call them A, B, C and D. Now
> the idea is that first I do SQL Join of A & B based on some field, then I
> create append stream from Joined A&B, let's call it E. Then I need to
> assign timestamps to E since it is a result of joining and Flink can't
> figure out the timestamps.
>
> Next, I union E & C, to create some F stream. Then finally I connect E & C
> using `keyBy` and CoProcessFunction. Now the issue I am facing is that if I
> try to, it works fine if I enforce the parallelism of E to be 1 by invoking
> *setParallelism*. But if parallelism is higher than 1, for the same data
> - the watermark is not progressing correctly. I can see that 
> *CoProcessFunction
> *methods are invoked and that data is produced, but the Watermark is
> never progressing for this function. What I can see is that watermark is
> always equal to (0 - allowedOutOfOrderness). I can see that timestamps are
> correctly extracted and when I add debug prints I can actually see that
> Watermarks are generated for all streams, but for some reason, if the
> parallelism is > 1 they will never progress up to connect function. Is
> there anything that needs to be done after SQL joins that I don't know of
> ??
>
> Best Regards,
> Dom.
>

Reply via email to