Hi All, I'm building a pipeline to process events as they come and do not really care about the event time and watermark. I'm more interested in not discarding the events and reducing the latency. The downstream pipeline has a stateful DoFn. I understand that the default window strategy is Global Windows,. I did not completely understand the default trigger as per https://beam.apache.org/releases/javadoc/2.23.0/org/apache/beam/sdk/transforms/windowing/DefaultTrigger.html it says Repeatedly.forever(AfterWatermark.pastEndOfWindow()), In case of global window how does this work (there is no end of window)?.
My source is Google PubSub and pipeline is running on Dataflow runner I have defined my window transform as below input.apply(TRANSFORM_NAME, Window.<T>into(new GlobalWindows()) > .triggering(Repeatedly.forever(AfterPane.elementCountAtLeast(1))) > .withAllowedLateness(Duration.standardDays(1)).discardingFiredPanes()) A couple of questions 1. Is triggering after each element inefficient in terms of persistence(serialization) after each element and also parallelism triggering after each looks like a serial execution? 2. How does Dataflow parallelize in such cases of triggers? Thanks and appreciate the responses. Kishore