A runner is free to process things in streaming mode, batch mode, or
even alternate between the two. Generally there are certain
efficiencies/simplifications that only work (well) in batch mode, and
on the other hand the presence of an unbounded source means one cannot
wait for a PCollection to be

A question for the runner implementers:
The Beam model is agnostic to streaming vs. batch. But I have use cases
where we replay history (from BigTable or BigQuery) and then transition
into streaming.
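
To make the shape of that input concrete, here is a rough sketch in plain Python. It is not Beam code: `replay_then_stream`, the `phase` field, and the sample records are all made up for illustration, with a list standing in for the BigTable/BigQuery read and an endless generator standing in for the streaming source.

```python
import itertools
from typing import Iterable, Iterator


def replay_then_stream(history: Iterable[dict], live: Iterable[dict]) -> Iterator[dict]:
    """Yield every historical record first, then continue with live records."""
    # Phase 1: bounded replay (hypothetical stand-in for a BigTable/BigQuery read).
    for record in history:
        yield {**record, "phase": "replay"}
    # Phase 2: unbounded tail (hypothetical stand-in for a streaming source).
    for record in live:
        yield {**record, "phase": "live"}


history = [{"id": 1}, {"id": 2}]
# An endless generator: it never terminates on its own, like an unbounded source.
live = ({"id": n} for n in itertools.count(3))

# Only take the first four elements, since the combined input has no end.
first_four = list(itertools.islice(replay_then_stream(history, live), 4))
```

The point is only that the consumer sees one input whose first part is finite and whose tail never ends, so nothing downstream can wait for the whole thing to finish.
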
Now with Splittable DoFns it's easier to create inputs that start in
batch and then switch to streaming. But I have