Hi Vamsi,
The short answer is: it depends. :)
There are many unknowns in your question. First of all - what kind of
logic do you refer to? Does it need to modify (i.e. join with) the
incoming data? Or is it just some (volatile) monitoring?
If you need timers for output data consistency, then yes: under the current
Flink Runner implementation there will necessarily be a shuffle.
If you are not worried about consistency and only need rough, inexact
estimates, then a pure stateless DoFn without @Timer can do the trick.
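To make the stateless option concrete, here is a minimal sketch in plain Java of
the check such a DoFn could perform. The class name PeriodicGate and its method
shouldRun are hypothetical, not part of Beam. The assumption is that each DoFn
instance creates one gate in its @Setup method and calls it from @ProcessElement
with the current processing time.

```java
import java.time.Duration;

/**
 * Hypothetical helper (not a Beam API): tells a single DoFn instance
 * whether its periodic logic is due. It holds no Beam state or timers,
 * so using it does not require a keyed input.
 */
final class PeriodicGate {
    private final long intervalMillis;
    private long lastRunMillis;
    private boolean hasRun = false;

    PeriodicGate(Duration interval) {
        this.intervalMillis = interval.toMillis();
    }

    /**
     * Returns true on the first call and then at most once per interval.
     * Intended to be called with System.currentTimeMillis() for each element.
     */
    boolean shouldRun(long nowMillis) {
        if (hasRun && nowMillis - lastRunMillis < intervalMillis) {
            return false; // the interval has not elapsed yet
        }
        lastRunMillis = nowMillis;
        hasRun = true;
        return true;
    }
}
```

Note the limits of this approach: the check only runs when an element arrives,
so on a sparse topic the logic can fire much later than 5 minutes, and every
parallel DoFn instance keeps its own clock, so the logic runs once per instance
rather than once per pipeline.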
Can you provide more details on your use case?
Jan
On 5/20/26 14:29, vamsikrishna korada wrote:
Hi Beam Community,
I’m reaching out for some guidance on a Beam Flink streaming job I’m
working on.
We are reading from a Kafka topic, where the traffic can be either
sparse or high-volume, and we need to run a piece of logic
periodically, roughly every 5 minutes.
We considered using @Timer, but based on the Beam docs, timers
require keyed state, which introduces a shuffle. We would like to
avoid this shuffle if possible.
Is there a way to trigger periodic logic in a Beam pipeline without
causing a data shuffle?
Thanks,
Vamsi