You have it basically right. However, there are a couple of minor
clarifications:

1. A particular window on the side input is not "ready" until at least one
element has been output to it (or until the window has expired, in which
case it takes the default value). Main input elements wait for the side
input to be ready. If you configure triggering on the side input, then the
first trigger firing makes it "ready". Of course, this means that the value
you read will be an incomplete view of the data. If you have a 24 hour
window with triggering set up, then the value that is read will be whatever
the most recent trigger firing produced, subject to some caching delay.
2. None of the "time" that you are talking about is real time. It is all
event time, so it is controlled by the side input and main input
watermarks. Of course, in streaming these are usually close to real time,
so on average what you describe is probably right.

It sounds like you want a side input with a trigger on it, if you want to
read it before you have all the data. This is highly nondeterministic, so
be sure that you do not require exact answers from the side input.

Kenn

On Tue, Jan 5, 2021 at 6:56 AM Manninger, Matyas <
matyas.mannin...@veolia.com> wrote:

> Dear Beam users,
>
> Can someone clarify for me how side inputs work in streaming? If I use a
> stream as a side input to my main stream, each element will be paired with
> a side input from the corresponding time window. Does this mean that the
> element will not be processed until the appropriate window on the side
> input stream is closed? So if my side input is windowed into 24 hour
> windows, will my elements from the main stream be processed only every 24
> hours? If not, then if the window is triggered for the side input at 12:00
> and the input actually only arrives at 12:05, then all elements from the
> main stream processed between 12:00 and 12:05 will be matched with an empty
> side input?
>
> Any clarification is appreciated.
>
> Best regards,
> Matyas
>
