Beam will block on side inputs until at least one value is available (or the watermark has advanced such that we can be sure one will never become available, which doesn't really apply to the global window case). After that, workers generally cache the side input value (for performance reasons) but may periodically re-fetch it (the exact cadence probably depends on the runner implementation).
On Tue, Sep 12, 2023 at 10:34 PM Ruben Vargas <ruben.var...@metova.com> wrote: > Hello Everyone > > I have a question, I have on my pipeline one side input that fetches some > configurations from an API endpoint each 30 seconds, my question is this. > > > I have something similar to what is showed in the side input patterns > documentation > > PCollectionView<Map<String, String>> map = > p.apply(GenerateSequence.from(0).withRate(1, > Duration.standardSeconds(5L))) > .apply( > ParDo.of( > new DoFn<Long, Map<String, String>>() { > > @ProcessElement > public void process( > @Element Long input, > @Timestamp Instant timestamp, > OutputReceiver<Map<String, String>> o) { > call HTTP endpoint here!! > } > })) > .apply( > Window.<Map<String, String>>into(new GlobalWindows()) > > .triggering(Repeatedly.forever(AfterProcessingTime.pastFirstElementInPane())) > .discardingFiredPanes()) > .apply(Latest.globally()) > .apply(View.asSingleton()); > > What happens if for example the HTTP endpoint takes time to respond due > some network issues and/or the amount of data. Is this gonna introduce > delays on my main pipeline? Is the main pipeline blocked until the pardo > that processes the side input ends? > > I don't care too much about the consistency here, I mean if the > configuration changed in the Time T1 I don't care if some registries with > T2 timestamp are processed with the configuration version of T1. > > > Regards. > >