Hi Kristoff, synchronization across operators is not easy to achieve.
If one needs to wait until a sink has processed some element, it requires that a sink participates in the pipeline. So it is not located as a "leaf" operator but location somewhere in the middle.
So your idea to call MQ directly in processFunction1 sounds like a reasonable solution to me. Maybe it is possible to wrap the original code somehow. It could require to go one level deeper in the DataStream API (using a custom stream transformation and operator instead of ProcessFunction).
Another idea that comes to my mind is that you use the checkpoint barrier as a synchronization tool. I'm not familiar how the MQ sink works, but if you can ensure that the side output is written out in the next checkpoint. You could leverage an interface like `org.apache.flink.runtime.state.CheckpointListener`.
I hope others might come up with a better idea. Regards, Timo On 14.04.20 23:59, KristoffSC wrote:
Hi all, I have a special use case that I'm not sure how I can fulfill. The use case is: I have my main business processing pipe line that has a MQ source, processFunction1, processFunction2 and MQ sink PocessFunction1 apart from processing the main business message is also emitting some side effects using side outputs. Those side outputs are send to SideOutputMqSink that sends them to the queue. The requirement is that PocessFunction1 must not send out the main business message further to processFunction2 until side output from processFunction1 is send to the queue via SideOutputMqSink. In general I don't have to use side outputs, although I do some extra processing on them before sending to the sink so having sideOutput stream is nice to have. Never the less, the key requirement is that we should wait with further processing until side utput is send to the queue. I could achieve it in a way that my processFunction1 in processElement method will call MQ directly before sending out the main message, although I dot like that idea. I was thinking is there a way to have a Sink function that would be also a FlatMap function? The best solution would be to be able to process two streams (main and side effect) in some nice way but with some barrier, so the main pipeline will wait until side output is send. Both streams can be keyed. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/