>
> Is there a reason you can't process the rows in parallel?
>

Yes, but only up to a very low limit.  In our current solution (not using
Beam) the limit is 4.  The workers are doing data sync over a WAN with very
high latency and relatively low bandwidth.  We limit the number of workers
(and thus the number of rows processed in parallel) to ensure that high
priority rows finish quickly and before low priority rows.  If the limit
were too high, the available bandwidth would be spread over too many tasks
-- both low and high priority -- increasing the time to complete any given
task.
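
To make the scheme above concrete, here is a rough sketch in plain Python
(no Beam).  This is not our actual code: `sync_rows`, `sync_row`, and the
"lower number means higher priority" convention are all made up for
illustration.  A fixed pool of workers pulls rows from a priority queue, so
at most `max_workers` transfers share the link at any time and high
priority rows are always started first.

```python
import queue
import threading

MAX_WORKERS = 4  # the limit mentioned above


def sync_rows(rows, sync_row, max_workers=MAX_WORKERS):
    """Process rows with bounded parallelism, highest priority first.

    rows: iterable of (priority, row_id); lower number = higher priority
          (an assumed convention for this sketch).
    sync_row: hypothetical callable that does the WAN transfer for one row.
    Returns the row ids in the order they completed.
    """
    pending = queue.PriorityQueue()
    for seq, (priority, row_id) in enumerate(rows):
        # seq breaks ties so equal-priority rows keep submission order
        pending.put((priority, seq, row_id))

    finished = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                _, _, row_id = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to do
            sync_row(row_id)
            with lock:
                finished.append(row_id)

    threads = [threading.Thread(target=worker) for _ in range(max_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return finished
```

Note that this only controls which rows start first.  A low priority row
that is already in flight keeps its share of the bandwidth until it is
done.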

> Can you pause inflight work on a row arbitrarily or does whatever work that
> you start must complete?
>

That is not a requirement, but it could be an interesting feature if done
correctly.

-chad
