Flink defines bundles in terms of number of elements and processing
time, by default 1000 elements or 1000 milliseconds, whichever happens
first. But bundles are not a "natural" concept in Flink; it uses them
merely to comply with the Beam model. By default, checkpoints are
unaligned with bundle
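
To make the count-or-time rule described above concrete, here is a minimal sketch in plain Python. It is a toy model, not the Flink runner's code: the function name `split_into_bundles` and the input of (arrival time in ms, element) pairs are assumptions made for illustration, and the time limit is only checked when an element arrives, whereas a real runner would presumably use a timer.

```python
from typing import Iterable, List, Tuple


def split_into_bundles(
    events: Iterable[Tuple[int, str]],
    max_size: int = 1000,
    max_time_ms: int = 1000,
) -> List[List[str]]:
    """Group (arrival_ms, element) pairs into bundles.

    A bundle is closed when it holds max_size elements, or when an element
    arrives max_time_ms or more after the bundle was opened.
    """
    bundles: List[List[str]] = []
    current: List[str] = []
    opened_at = 0
    for arrival_ms, element in events:
        # Time limit: close the open bundle before accepting a late arrival.
        if current and arrival_ms - opened_at >= max_time_ms:
            bundles.append(current)
            current = []
        if not current:
            opened_at = arrival_ms
        current.append(element)
        # Size limit: close the bundle as soon as it is full.
        if len(current) >= max_size:
            bundles.append(current)
            current = []
    if current:
        bundles.append(current)
    return bundles
```

With a small `max_size`, a burst of elements is cut by count; with sparse arrivals, the time limit cuts first.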
This is your daily summary of Beam's current high priority issues that may need
attention.
See https://beam.apache.org/contribute/issue-priorities for the meaning and
expectations around issue priorities.
Unassigned P1 Issues:
https://github.com/apache/beam/issues/28383 [Failing Test]:
or
What is the best way to amortize heavy operations across elements in Flink?
(that is what bundles are for, basically)
On Fri, Sep 22, 2023 at 5:09 AM Jan Lukavský wrote:
> Flink defines bundles in terms of number of elements and processing time,
> by default 1000 elements or 1000 milliseconds, w
Whoops, I typoed my last email. I meant to write "this isn't the
greatest strategy for high *fixed* cost transforms", e.g. a transform that
takes 5 minutes to get set up and then maybe a microsecond per input
I suppose one solution is to move the responsibility for handling this kind
of situation
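
The arithmetic behind the "high fixed cost" concern can be sketched as follows. This assumes the setup cost is paid once per bundle, which is a simplification for illustration; the function name `amortized_cost_s` is hypothetical, and the figures are the 5 minutes and roughly one microsecond quoted above.

```python
def amortized_cost_s(setup_s: float, per_element_s: float, bundle_size: int) -> float:
    """Average seconds spent per element when setup is paid once per bundle."""
    if bundle_size <= 0:
        raise ValueError("bundle_size must be positive")
    return setup_s / bundle_size + per_element_s


SETUP_S = 5 * 60        # 5 minutes of setup, from the example above
PER_ELEMENT_S = 1e-6    # about a microsecond per input

for size in (1_000, 1_000_000, 1_000_000_000):
    cost = amortized_cost_s(SETUP_S, PER_ELEMENT_S, size)
    print(f"bundle of {size:>13,} elements: {cost:.7f} s per element")
```

Under these assumptions, the setup term dominates until the bundle is several orders of magnitude larger than 1000 elements.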
(I notice that you replied only to yourself, but there has been a whole
thread of discussion on this - are you subscribed to dev@beam?
https://lists.apache.org/thread/k81fq301ypwmjowknzyqq2qc63844rbd)
It sounds like you want what everyone wants: to have the biggest bundles
possible.
So for bounde
I've actually wondered about this specifically for streaming... if you're
writing a pipeline there it seems like you're often going to want to put
high fixed cost things like database connections even outside of the bundle
setup. You really only want to do that once in the lifetime of the worker
it
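
One way to picture paying a fixed cost once rather than once per bundle is the toy model below. It is plain Python, not a Beam pipeline: `WriteFn` only borrows the hook names of Beam's Python `DoFn` (`setup`, `start_bundle`, `process`, `finish_bundle`, `teardown`), while `FakeConnection` and the `run` driver are hypothetical stand-ins for an expensive resource and for a runner.

```python
from typing import List


class FakeConnection:
    """Hypothetical stand-in for an expensive resource such as a DB client."""

    opened = 0  # counts how many connections were ever created

    def __init__(self) -> None:
        FakeConnection.opened += 1

    def write(self, batch: List[int]) -> int:
        return len(batch)


class WriteFn:
    """Plain-Python class using the hook names of Beam's Python DoFn."""

    def setup(self) -> None:
        # Called once per instance: pay the high fixed cost here.
        self.conn = FakeConnection()
        self.written = 0

    def start_bundle(self) -> None:
        # Called once per bundle: only cheap, per-bundle state.
        self.buffer: List[int] = []

    def process(self, element: int) -> None:
        self.buffer.append(element)

    def finish_bundle(self) -> None:
        # Flush the whole bundle through the long-lived connection.
        self.written += self.conn.write(self.buffer)

    def teardown(self) -> None:
        self.conn = None


def run(fn: WriteFn, bundles: List[List[int]]) -> None:
    """Toy driver that calls the hooks in the order a runner would."""
    fn.setup()
    for bundle in bundles:
        fn.start_bundle()
        for element in bundle:
            fn.process(element)
        fn.finish_bundle()
    fn.teardown()
```

In this model the number of connections opened does not depend on how many bundles the input is split into, so bundle size only affects the per-flush overhead.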
Ah! Thanks for that catch. I had subscribed to the user mailing list but
forgot to ever sub to the dev list
On Fri, Sep 22, 2023 at 10:03 AM Kenneth Knowles wrote:
> (I notice that you replied only to yourself, but there has been a whole
> thread of discussion on this - are you subscribed to dev
I feel like that's actually pretty easy with GitHub Actions? I think maybe
there's even one that exists for GitHub Pages and probably any other static
site generator thingy we could care to name. Related, I stumbled across
this the other day: https://github.com/apache/beam-site which appears to be
unus
> I do feel strongly that https://beam.apache.org/contribute/ should remain
on the main site, as it's aimed at users (who hopefully want to step up and
contribute)
To be clear, I don't think anyone is suggesting getting rid of the section,
my comments were about replacing the side panel links with
Flink operators are long-running classes with a life cycle of open() and
close(), so any amortization can be done between those methods, see [1].
Essentially, one could view the complete (unbounded) input in vanilla Flink
as a single "bundle". The crucial point is that state is
checkpointe
On Fri, Sep 22, 2023 at 7:23 AM Byron Ellis via dev
wrote:
> I've actually wondered about this specifically for streaming... if you're
> writing a pipeline there it seems like you're often going to want to put
> high fixed cost things like database connections even outside of the bundle
> setup.
On Fri, Sep 22, 2023 at 8:05 AM Danny McCormick via dev
wrote:
> > I do feel strongly that https://beam.apache.org/contribute/ should
> remain on the main site, as it's aimed at users (who hopefully want to step
> up and contribute)
>
> To be clear, I don't think anyone is suggesting getting rid
On 9/22/23 18:07, Robert Bradshaw via dev wrote:
On Fri, Sep 22, 2023 at 7:23 AM Byron Ellis via dev
wrote:
I've actually wondered about this specifically for streaming... if
you're writing a pipeline there it seems like you're often going
to want to put high fixed cost things lik
On Fri, Sep 22, 2023 at 10:58 AM Jan Lukavský wrote:
> On 9/22/23 18:07, Robert Bradshaw via dev wrote:
>
> On Fri, Sep 22, 2023 at 7:23 AM Byron Ellis via dev
> wrote:
>
>> I've actually wondered about this specifically for streaming... if you're
>> writing a pipeline there it seems like you're