If you create a Pub/Sub subscription before you start the Beam pipeline, the subscription will capture all of those messages (as long as you start the Beam pipeline within 7 days). You can then start the Beam pipeline against that subscription, which should do what you want.
On Fri, May 16, 2025 at 3:37 PM Jonathan Hope < jonathan.douglas.h...@gmail.com> wrote: > I have a use case where I would like to start publishing messages to a > PubSub topic in advance of starting a streaming pipeline that consumes > those messages. When the pipeline starts I would like it to "catch up" > which is to say to read all of the messages that were published even before > it started. In this case event time doesn't matter to me. All that matters > is that I see every message once. > > My initial thought was something like this: > > ``` > trg := trigger.AfterAny([]trigger.Trigger{ > trigger.Repeat(trigger.AfterProcessingTime().PlusDelay(time.Minute * 1)), > trigger.AfterCount(1_000), > }) > > beam.WindowInto(s, window.NewGlobalWindows(), x, beam.Trigger(trg), > beam.PanesDiscard()) > ``` > > Basically process a minutes worth OR a 1000 messages. This definitely > doesn't work though. Later on I call `databaseio.Write`, and that never > does anything. However if I publish a message _while_ the pipeline is > running `databaseio.Write` seems to pick it up. I will freely admit I don't > have much of an idea what I'm doing here. >