The same execution engine that powers Dataflow also powers Flume [1], which is the workhorse for data processing inside Google and is regularly run with orders of magnitude more workers than the limits listed at [2]. I don't know what the theoretical maximum is, but it's far above what's available in GCE.

[1] https://research.google/pubs/flumejava-easy-efficient-data-parallel-pipelines/
[2] https://docs.cloud.google.com/dataflow/quotas#limits

On Fri, Jan 30, 2026 at 7:47 AM Danny McCormick via dev <[email protected]> wrote:
>
> https://docs.cloud.google.com/dataflow/quotas has some system limits (which
> vary depending on batch vs streaming). This is also impacted by the project's
> total Compute Engine quota. I'm not sure how hard of a limit the worker limit
> is or why it is set where it is (others may know more than me).
>
> As you can imagine, this problem doesn't come up often. The more common
> problem is that Dataflow can't scale up further without degrading the IO it
> communicates with.
>
> Thanks,
> Danny
>
> On Fri, Jan 30, 2026 at 9:47 AM Joey Tran <[email protected]> wrote:
>>
>> Out of curiosity, is there a technical max number of workers that the
>> DataflowRunner can run? Or is there no theoretical limit, so long as you
>> horizontally scale the runner node?
