If you create a Pub/Sub subscription before you start the Beam pipeline,
the subscription will retain every message published after it was created
(by default for up to 7 days, so start the Beam pipeline within that
window). You can then start the Beam pipeline against that subscription,
and it should work through the backlog before moving on to new messages.

On Fri, May 16, 2025 at 3:37 PM Jonathan Hope <
jonathan.douglas.h...@gmail.com> wrote:

> I have a use case where I would like to start publishing messages to a
> PubSub topic in advance of starting a streaming pipeline that consumes
> those messages. When the pipeline starts I would like it to "catch up"
> which is to say to read all of the messages that were published even before
> it started. In this case event time doesn't matter to me. All that matters
> is that I see every message once.
>
> My initial thought was something like this:
>
> ```
> trg := trigger.AfterAny([]trigger.Trigger{
>   trigger.Repeat(trigger.AfterProcessingTime().PlusDelay(time.Minute * 1)),
>   trigger.AfterCount(1_000),
> })
>
> beam.WindowInto(s, window.NewGlobalWindows(), x, beam.Trigger(trg),
> beam.PanesDiscard())
> ```
>
> Basically process a minute's worth OR 1,000 messages. This definitely
> doesn't work though. Later on I call `databaseio.Write`, and that never
> does anything. However, if I publish a message _while_ the pipeline is
> running, `databaseio.Write` seems to pick it up. I will freely admit I don't
> have much of an idea what I'm doing here.
>
