> Is there a reason you can't process the rows in parallel?

Yes, but only up to a very low limit. In our current solution (not using Beam) the limit is 4. The workers sync data over a WAN with very high latency and relatively low bandwidth. We limit the number of workers (and thus the number of rows processed in parallel) to ensure that high-priority rows finish quickly and before low-priority rows. If the limit were too high, the available bandwidth would be spread over too many tasks -- both low and high priority -- which would increase the time to complete any given task.
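
To make the scheme concrete, here is a rough sketch in plain Python (not Beam, and not our actual code). The names `run_by_priority`, `sync_row`, and `MAX_WORKERS` are made up for illustration, and it assumes that a lower priority number means more urgent:

```python
import queue
import threading

MAX_WORKERS = 4  # the low parallelism limit described above


def run_by_priority(rows, sync_row, max_workers=MAX_WORKERS):
    """Process (priority, row) pairs with at most max_workers in flight.

    Lower priority numbers are picked up first. Returns the rows in the
    order in which workers started them.
    """
    pending = queue.PriorityQueue()
    for seq, (priority, row) in enumerate(rows):
        # seq breaks ties, so the rows themselves are never compared
        pending.put((priority, seq, row))

    started = []
    lock = threading.Lock()

    def worker():
        while True:
            # Dequeue and record under one lock so the start order is
            # exactly the priority order.
            with lock:
                try:
                    _, _, row = pending.get_nowait()
                except queue.Empty:
                    return
                started.append(row)
            sync_row(row)  # the slow WAN transfer happens outside the lock

    threads = [threading.Thread(target=worker) for _ in range(max_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return started
```

Because only `max_workers` transfers share the link at any time, a high-priority row never competes with more than `max_workers - 1` others for bandwidth. This sketch does not preempt work that has already started.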

> Can you pause in-flight work on a row arbitrarily, or must whatever work
> you start run to completion?

Pausing is not a requirement, but it could be an interesting feature if done correctly.

-chad