Re: SplittableDoFn-based source doesn't efficiently scale up in Dataflow

2022-05-09 Thread Claire McGinty
Can you clarify what you mean by being over-aggressive in splitRestriction? We can't split any finer than our unit of splittability, which is a single row group.

Thanks!
-Claire

On Tue, May 3, 2022 at 9:14 PM Robert Bradshaw wrote:
> On Tue, May 3, 2022 at 10:39 AM Claire McGinty wrote

SplittableDoFn-based source doesn't efficiently scale up in Dataflow

2022-05-03 Thread Claire McGinty
Hi Beam users, I'm looking for input on one of our IOs that we recently migrated to SplittableDoFn. When running in Dataflow, we saw performance gains in every metric (vCPU hours, total memory time) except total elapsed time: the SplittableDoFn implem