The default trigger will only fire when the global window closes which does
happen with sources whose watermark goes > GlobalWindow.MAX_TIMESTAMP or
during pipeline drain with partial results in streaming. Bounded sources
commonly have their watermark advance to the end of time when they complete
and some unbounded sources can stop producing output if they detect the end.

Parallelization for stateful DoFns are per key and window. Parallelization
for GBK is per key and window pane. Note that  elementCountAtLeast means
that the runner can buffer as many as it wants and can decide to offer a
low latency pipeline by triggering often or better throughput through the
use of buffering.



On Mon, Oct 12, 2020 at 8:22 AM KV 59 <kvajjal...@gmail.com> wrote:

> Hi All,
>
> I'm building a pipeline to process events as they come and do not really
> care about the event time and watermark. I'm more interested in not
> discarding the events and reducing the latency. The downstream pipeline has
> a stateful DoFn. I understand that the default window strategy is Global
> Windows,. I did not completely understand the default trigger as per
>
> https://beam.apache.org/releases/javadoc/2.23.0/org/apache/beam/sdk/transforms/windowing/DefaultTrigger.html
> it says Repeatedly.forever(AfterWatermark.pastEndOfWindow()), In case of
> global window how does this work (there is no end of window)?.
>
> My source is Google PubSub and pipeline is running on Dataflow runner I
> have defined my window transform as below
>
> input.apply(TRANSFORM_NAME, Window.<T>into(new GlobalWindows())
>> .triggering(Repeatedly.forever(AfterPane.elementCountAtLeast(1)))
>> .withAllowedLateness(Duration.standardDays(1)).discardingFiredPanes())
>
>
> A couple of questions
>
>    1. Is triggering after each element inefficient in terms of
>    persistence(serialization) after each element and also parallelism
>    triggering after each looks like a serial execution?
>    2. How does Dataflow parallelize in such cases of triggers?
>
>
> Thanks and appreciate the responses.
>
> Kishore
>

Reply via email to