Actually, I only put this process function there for debugging purposes. My main goal is to join E & C using the Temporal Table function, but I observed exactly the same behavior there: when the parallelism was > 1 there was no output, and when I set it to 1 the output was generated. So I switched to a process function to see whether the watermarks are reaching this stage.
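For context, a minimal plain-Java sketch of how an operator with several input channels combines watermarks: Flink documents that such an operator's event time is the minimum of the watermarks received on its inputs. This is not Flink API; the class and method names (InputWatermarkTracker, onWatermark, currentWatermark) are hypothetical and only illustrate that rule. Whether it explains the behavior described here is an assumption, not something confirmed in this thread.

```java
import java.util.Arrays;

// Plain-Java illustration only; not Flink API. All names are hypothetical.
final class InputWatermarkTracker {
    // Last watermark seen on each input channel (one per upstream subtask).
    private final long[] channelWatermarks;

    InputWatermarkTracker(int numChannels) {
        channelWatermarks = new long[numChannels];
        // Until a channel delivers a watermark, it holds the lowest possible value.
        Arrays.fill(channelWatermarks, Long.MIN_VALUE);
    }

    // Record a watermark from one channel; a channel's watermark never moves backwards.
    void onWatermark(int channel, long watermark) {
        channelWatermarks[channel] = Math.max(channelWatermarks[channel], watermark);
    }

    // The operator's watermark is the minimum over all input channels.
    long currentWatermark() {
        return Arrays.stream(channelWatermarks).min().getAsLong();
    }

    public static void main(String[] args) {
        // Two upstream subtasks, but only channel 0 ever sends a watermark.
        InputWatermarkTracker tracker = new InputWatermarkTracker(2);
        tracker.onWatermark(0, 5_000L);
        System.out.println("one channel silent: " + tracker.currentWatermark());

        // Once channel 1 also reports, the combined watermark can advance.
        tracker.onWatermark(1, 3_000L);
        System.out.println("both channels active: " + tracker.currentWatermark());
    }
}
```

With a single channel (parallelism 1) the combined value simply follows that channel, whereas with two channels a subtask that never advances holds the combined watermark back.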
Best Regards,
Dom.

On Mon, 16 Mar 2020 at 19:46, Theo Diefenthal <theo.diefent...@scoop-software.de> wrote:

> Hi Dominik,
>
> I had the same once with a custom process function. My process function
> buffered the data for a while and then output it again. As the process
> function can do anything with the data (transforming, buffering,
> aggregating...), I think it's just not safe for Flink to reason about the
> watermark of the output.
>
> I solved all my issues by calling `assignTimestampsAndWatermarks` directly
> after the (co-)process function.
>
> Best regards
> Theo
>
> ------------------------------
> *From: *"Dominik Wosiński" <wos...@gmail.com>
> *To: *"user" <user@flink.apache.org>
> *Sent: *Monday, 16 March 2020 16:55:18
> *Subject: *Issues with Watermark generation after join
>
> Hey,
> I have noticed a weird behavior with a job that I am currently working on.
> I have 4 different streams from Kafka, let's call them A, B, C and D. Now
> the idea is that first I do a SQL join of A & B based on some field, then I
> create an append stream from the joined A & B, let's call it E. Then I need
> to assign timestamps to E since it is a result of joining and Flink can't
> figure out the timestamps.
>
> Next, I union E & C to create some F stream. Then finally I connect E & C
> using `keyBy` and CoProcessFunction. Now the issue I am facing is that
> it works fine if I enforce the parallelism of E to be 1 by invoking
> *setParallelism*. But if the parallelism is higher than 1, for the same
> data, the watermark is not progressing correctly. I can see that
> *CoProcessFunction* methods are invoked and that data is produced, but the
> watermark never progresses for this function. What I can see is that the
> watermark is always equal to (0 - allowedOutOfOrderness). I can see that
> timestamps are correctly extracted, and when I add debug prints I can
> actually see that watermarks are generated for all streams, but for some
> reason, if the parallelism is > 1, they never progress up to the connect
> function. Is there anything that needs to be done after SQL joins that I
> don't know of?
>
> Best Regards,
> Dom.