Hi Raman,
Can you share the details of the pipeline? How exactly are you using the
looping timer? The timer as described in the linked blog post should be
deterministic even when the order of the input elements is undefined.
Does your logic depend on element ordering?
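
To illustrate what I mean by order-independent, here is a minimal
plain-Java sketch. It is not the Beam API and not your pipeline: the class
name, the method names and the one-minute period are all hypothetical. It
assumes the per-key state keeps the earliest candidate firing time seen so
far, and that the timer has not already fired before an earlier element
arrives. Under those assumptions the first firing depends only on the
minimum element timestamp, not on arrival order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Plain-Java simulation only; every name here is hypothetical.
class LoopingTimerSketch {
    // Assumed looping period of one minute, in milliseconds.
    static final long PERIOD_MILLIS = 60_000L;

    // Returns the event time of the first timer firing for one key.
    // The local variable stands in for a per-key value state that
    // remembers the earliest timer timestamp requested so far.
    static long firstFiring(List<Long> elementTimestamps) {
        if (elementTimestamps.isEmpty()) {
            throw new IllegalArgumentException("need at least one element");
        }
        Long timerTimestamp = null;
        for (long ts : elementTimestamps) {
            long candidate = ts + PERIOD_MILLIS;
            // Only move the timer when this element asks for an earlier firing.
            if (timerTimestamp == null || candidate < timerTimestamp) {
                timerTimestamp = candidate;
            }
        }
        return timerTimestamp;
    }

    public static void main(String[] args) {
        List<Long> inOrder = Arrays.asList(1_000L, 5_000L, 9_000L);
        List<Long> shuffled = new ArrayList<>(inOrder);
        Collections.shuffle(shuffled);
        // Both orders yield the same first firing time.
        System.out.println(firstFiring(inOrder) == firstFiring(shuffled));
    }
}
```

If your DoFn instead sets the timer from whichever element happens to
arrive first and never moves it earlier, the start of the loop will vary
with input order.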
Jan
On 1/12/21 3:18 PM, Raman Gupta wrote:
Hello, I am building and testing a pipeline with the direct runner.
The pipeline includes a looping timer -
https://beam.apache.org/blog/looping-timers/.
For now, I am using JdbcIO to obtain my input data, though when put
into production the pipeline will use PubSubIO.
I am finding that the looping timer begins producing outputs at a
random event time, which makes some sense given the randomization of
inputs in the direct runner. However, this makes the results of
executing my pipeline with the direct runner completely non-deterministic.
So:
1) Is there a way to turn off this non-deterministic behavior, but
just for the GlobalWindow / LoopingTimer?
2) Perhaps alternatively, is there a way to "initialize" the looping
timer when the pipeline starts, rather than when it first sees an
element? Perhaps a side input?
3) Am I right in assuming that when I move this pipeline to pub/sub io
and operate it in streaming mode, this issue will go away?
Thanks!
Raman