Flink defines bundles in terms of number of elements and processing
time, by default 1000 elements or 1000 milliseconds, whichever happens
first. But bundles are not a "natural" concept in Flink; it uses them
merely to comply with the Beam model. By default, checkpoints are
not aligned with bundles.
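
To illustrate the "count or time, whichever happens first" rule, here is a
minimal Python sketch. It is not the Flink runner's code: the class name
BundleBuffer, its parameters, and the injectable clock are all hypothetical,
chosen only to show the idea. The defaults mirror the 1000 elements /
1000 milliseconds mentioned above.

```python
import time
from typing import Callable, List, Optional


class BundleBuffer:
    """Hypothetical helper: closes a bundle once it holds max_size
    elements or is max_age_ms old, whichever happens first."""

    def __init__(self, max_size: int = 1000, max_age_ms: float = 1000,
                 clock: Callable[[], float] = time.monotonic):
        self.max_size = max_size
        self.max_age_ms = max_age_ms
        self.clock = clock  # returns seconds; injectable for testing
        self.elements: List[object] = []
        self.started_at: Optional[float] = None

    def add(self, element: object) -> Optional[List[object]]:
        """Buffer one element; return the finished bundle if a limit was hit."""
        if self.started_at is None:
            # The bundle's age is measured from its first element.
            self.started_at = self.clock()
        self.elements.append(element)
        age_ms = (self.clock() - self.started_at) * 1000
        if len(self.elements) >= self.max_size or age_ms >= self.max_age_ms:
            return self.flush()
        return None

    def flush(self) -> List[object]:
        """Close the current bundle and start a fresh one."""
        bundle, self.elements, self.started_at = self.elements, [], None
        return bundle
```

This sketch only checks the time limit when a new element arrives. A real
runner would presumably also use a timer, so that a bundle is closed even
if no further elements show up.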
Jan
On 9/22/23 01:58, Robert Bradshaw via dev wrote:
Dataflow uses a work-stealing protocol. The FnAPI has a protocol to
ask the worker to stop at a certain element that has already been sent.
On Thu, Sep 21, 2023 at 4:24 PM Joey Tran <joey.t...@schrodinger.com>
wrote:
I'm writing a runner, and my first strategy for determining bundle
size was to just start with a bundle size of one and double it
until we reach a size that we expect to take some target
per-bundle runtime (e.g. maybe 10 minutes). I realize that this
isn't the greatest strategy for high-cost transforms. I'm
curious what kind of strategies other runners take.
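
For concreteness, the doubling strategy described above might look roughly
like the following Python sketch. Everything here is an assumption made for
illustration: the function name run_with_doubling, the process_bundle
callback, and the rule "double only if twice the last runtime still fits
within the target" (which assumes runtime scales about linearly with bundle
size). It is not taken from any existing runner.

```python
import time
from typing import Callable, List, Sequence


def run_with_doubling(elements: Sequence[object],
                      process_bundle: Callable[[List[object]], None],
                      target_seconds: float = 600.0,
                      clock: Callable[[], float] = time.monotonic) -> List[int]:
    """Process elements in bundles, starting at size 1 and doubling the
    size while the next bundle is expected to stay within target_seconds.
    Returns the sizes of the bundles that were actually processed."""
    size = 1
    start = 0
    sizes: List[int] = []
    while start < len(elements):
        bundle = list(elements[start:start + size])
        t0 = clock()
        process_bundle(bundle)
        elapsed = clock() - t0
        sizes.append(len(bundle))
        start += len(bundle)
        # Assumption: runtime grows roughly linearly with bundle size,
        # so a bundle twice as large takes about twice as long.
        if elapsed * 2 <= target_seconds:
            size *= 2
    return sizes
```

With a transform that takes one second per element and a target of eight
seconds, 20 elements would be split into bundles of 1, 2, 4, 8 and 5
elements. The sketch also shows the weakness: if a single element already
takes longer than the target, the first bundle of size one overshoots and
there is no smaller size to fall back to.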