Hi Raman,

Can you share the details of the pipeline? How exactly are you using the looping timer? The timer as described in the linked blog post should be deterministic even when the order of the input elements is undefined. Does your logic depend on element ordering?
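To illustrate what I mean by ordering dependence, here is a minimal sketch in plain Python. It is not Beam code and not taken from your pipeline; the function names, the 60-second interval, and the sample timestamps are all hypothetical. It compares a timer armed from whichever element happens to arrive first with one armed from the minimum event timestamp seen so far.

```python
import random

def naive_first_fire_time(timestamps, interval):
    """Arm the timer from whichever element arrives first (order dependent)."""
    first = timestamps[0]
    # Fire at the end of the interval that contains the first-seen element.
    return (first // interval + 1) * interval

def min_based_first_fire_time(timestamps, interval):
    """Arm the timer from the smallest event timestamp seen (order independent)."""
    earliest = None
    for ts in timestamps:
        # Keep the minimum, as a stateful step would keep it in per-key state.
        if earliest is None or ts < earliest:
            earliest = ts
    return (earliest // interval + 1) * interval

if __name__ == "__main__":
    events = [125, 61, 190, 305]  # hypothetical event times, in seconds
    shuffled = events[:]
    random.shuffle(shuffled)
    print("naive:", naive_first_fire_time(shuffled, 60))
    print("min-based:", min_based_first_fire_time(shuffled, 60))
```

With the naive variant the first firing moves with the arrival order, while the min-based variant always gives the same result for the same set of elements. In a real pipeline this only holds if the timer can still be moved back when an earlier element shows up, that is, if the watermark has not yet passed it.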

 Jan

On 1/12/21 3:18 PM, Raman Gupta wrote:
Hello, I am building and testing a pipeline with the direct runner. The pipeline includes a looping timer - https://beam.apache.org/blog/looping-timers/.

For now, I am using JdbcIO to obtain my input data, though when put into production the pipeline will use PubSubIO.

I am finding that the looping timer begins producing outputs at a random event time, which makes some sense given the randomization of inputs in the direct runner. However, this makes the results of executing my pipeline with the direct runner completely non-deterministic.

So:

1) Is there a way to turn off this non-deterministic behavior, but just for the GlobalWindow / LoopingTimer?

2) Perhaps alternatively, is there a way to "initialize" the looping timer when the pipeline starts, rather than when it first sees an element? Perhaps a side input?

3) Am I right in assuming that when I move this pipeline to PubSubIO and operate it in streaming mode, this issue will go away?

Thanks!
Raman
