Thanks all for the feedback and discussion. I have updated the FLIP with the following changes.

- Choose the main proposal over the alternative approach.
- Combine weights of RocksDB and batch operators.
- Expose weights through configuration options, rather than via ExecutionConfig (sketched below).
- Add implementation plan.
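For illustration, the configuration-option approach might look roughly like this in flink-conf.yaml (the option key, consumer names, and default weights below are an illustrative sketch, not the final design; please see the FLIP for the actual proposal):

    # Illustrative sketch only: a single map-style option assigning relative
    # weights to managed memory consumers within a slot. DATAPROC here stands
    # for the combined RocksDB state backend / batch operator weight.
    taskmanager.memory.managed.consumer-weights: DATAPROC:70,PYTHON:30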
Please help take another look.

Thank you~

Xintong Song

On Wed, Sep 2, 2020 at 2:41 PM Xintong Song <tonysong...@gmail.com> wrote:

> Thanks for the inputs, Aljoscha & Till.
>
> # Weight Configuration
>
> I think exposing the knobs incrementally is a good idea. However, I'm not sure about non-configurable as the first step.
>
> Currently, users can tune memory for rocksdb ('taskmanager.memory.managed.size') and python ('python.fn-execution.[framework|buffer].memory.size') separately, which practically means any combination of rocksdb and python memory sizes. If we switch to non-configurable weights, that will be a regression compared to 1.11.
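> For reference, in Flink 1.11 the two can be sized independently in flink-conf.yaml, e.g. (the sizes below are arbitrary examples, not recommendations):
>
>     taskmanager.memory.managed.size: 512m
>     python.fn-execution.framework.memory.size: 64m
>     python.fn-execution.buffer.memory.size: 15m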
> Therefore, I think exposing via configuration options might be a good first step. And we can discuss exposing via ExecutionConfig if we later see that requirement.
>
> # Naming of Weights
>
> I'm neutral on "Flink/Internal memory".
>
> I think the reason we can combine weights for batch algorithms and state backends is that they are never mixed together. My only concern with "Flink/Internal memory", which might not be a problem at the moment, is what happens if new memory use cases appear in the future that can also be described as "Flink/Internal memory" but are not guaranteed never to be mixed with batch algorithms or state backends.
>
> Anyway, I think the naming should not block this FLIP, as long as we have consensus on combining the two weights for rocksdb and batch algorithms. We can keep the naming discussion open until the implementation phase.
>
> Thank you~
>
> Xintong Song
>
> On Tue, Sep 1, 2020 at 10:19 PM Till Rohrmann <trohrm...@apache.org> wrote:
>
>> Thanks for creating this FLIP Xintong.
>>
>> I agree with the previous comments that the memory configuration should be as easy as possible. Every new knob has the potential to confuse users and/or allow them to shoot themselves in the foot. Consequently, I am +1 for the first proposal in the FLIP since it is simpler.
>>
>> Also +1 for Stephan's proposal to combine batch operators' and RocksDB's memory usage into one weight.
>>
>> Concerning the names for the two weights, I fear that we are facing one of the two hard things in computer science. To add another idea, we could name them "Flink memory"/"Internal memory" and "Python memory".
>>
>> For the sake of making the scope of the FLIP as small as possible and to develop the feature incrementally, I think that Aljoscha's proposal to make it non-configurable as the first step sounds like a good idea. As a next step (and also if we see the need), we can make the memory weights configurable via the configuration. And last, we could expose it via the ExecutionConfig if it is required.
>>
>> Cheers,
>> Till
>>
>> On Tue, Sep 1, 2020 at 2:24 PM Aljoscha Krettek <aljos...@apache.org> wrote:
>>
>>> Hi,
>>>
>>> playing devil's advocate here: should we even make the memory weights configurable? We could go with weights that should make sense for most cases in the first version and only introduce configurable weights when (if) users need them.
>>>
>>> Regarding where/how things are configured, I think that most things should be a ConfigOption first (thanks for cc'ing me, Stephan!). This makes them configurable via flink-conf.yaml and via command line parameters, for example "bin/flink run -D memory.foo=bla ...".
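>>> For instance, a weight option could then be overridden for a single job submission (the option key here is purely illustrative):
>>>
>>>     bin/flink run -D taskmanager.memory.managed.consumer-weights=DATAPROC:60,PYTHON:40 ...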
>>> We can think about offering a programmatic API for cases where it makes sense, of course.
>>>
>>> Regarding naming one of the configurable weights "StateBackend-BatchAlgorithm": I think it's not a good idea to be that specific because the option will not age well, for example when we want to change which group of memory consumers are configured together or when we add something new.
>>>
>>> Best,
>>> Aljoscha
>>>
>>> On 31.08.20 08:13, Xintong Song wrote:
>>>> Thanks for the feedback, @Stephan
>>>>
>>>>> - There is a push to make as much as possible configurable via the main configuration, and not only in code. Specifically values for operations and tuning. I think it would be more important to have such memory weights in the config, compared to in the program API. /cc Aljoscha
>>>>
>>>> I can see the benefit that having memory weights in the main configuration makes tuning easier, which makes great sense to me. On the other hand, what we lose is the flexibility to have different weights for jobs running in the same Flink cluster. It seems to me the problem is that we don't have an easy way to override job-specific configurations without touching the code.
>>>>
>>>> Given the current status, what if we make the memory weights configurable through both the main configuration and the programming API? The main configuration should take effect iff the weights are not explicitly specified through the programming API. In this way, job cluster users can easily tune the weights through the main configuration, while session cluster users, if they want different weights for their jobs, can still override the weights through execution configs.
>>>>
>>>>> - My recommendation would be to keep this as simple as possible. Otherwise this will make a lot of configuration code harder, and make it harder for users to understand Flink's memory model. Making things as easy for users to understand is very important in my opinion. In that regard, the main proposal in the FLIP seems better than the alternative proposal listed at the end of the FLIP page.
>>>>
>>>> +1 from my side.
>>>>
>>>>> - For simplicity, we could go even further and simply have two memory users at the moment: the operator algorithm/data-structure and the external language process (Python for now). We never have batch algos and RocksDB mixed; having these as separate options is confusing, as it suggests they can be combined arbitrarily. I also think that the slim possibility that we may ever combine them in the future is not enough reason to make this more complex/confusing.
>>>>
>>>> Good point. +1 for combining the batch/rocksdb weights, for they're never mixed together. We can even just name it "StateBackend-BatchAlgorithm" to be explicit.
>>>>
>>>> For "external language process", I'm not entirely sure. Future external languages are possibly mixed with python processes. To avoid later having to consider how to share external language memory across different languages, I would suggest presenting the concept as "python memory" rather than "external language process memory".
>>>>
>>>> Thank you~
>>>>
>>>> Xintong Song
>>>>
>>>> On Sun, Aug 30, 2020 at 10:19 PM Stephan Ewen <se...@apache.org> wrote:
>>>>
>>>>> Thanks for driving this proposal. A few thoughts on the current design:
>>>>>
>>>>> - There is a push to make as much as possible configurable via the main configuration, and not only in code. Specifically values for operations and tuning. I think it would be more important to have such memory weights in the config, compared to in the program API. /cc Aljoscha
>>>>>
>>>>> - My recommendation would be to keep this as simple as possible. Otherwise this will make a lot of configuration code harder, and make it harder for users to understand Flink's memory model. Making things as easy for users to understand is very important in my opinion. In that regard, the main proposal in the FLIP seems better than the alternative proposal listed at the end of the FLIP page.
>>>>>
>>>>> - For simplicity, we could go even further and simply have two memory users at the moment: the operator algorithm/data-structure and the external language process (Python for now). We never have batch algos and RocksDB mixed; having these as separate options is confusing, as it suggests they can be combined arbitrarily. I also think that the slim possibility that we may ever combine them in the future is not enough reason to make this more complex/confusing.
>>>>>
>>>>> - I am also not aware of any plans to combine the network and operator memory. Not that it would be infeasible to do this, but I think this would also be orthogonal to this change, and I am not sure it would be solved with static weights. So trying to get network memory into this proposal seems premature to me.
>>>>>
>>>>> Best,
>>>>> Stephan
>>>>>
>>>>> On Fri, Aug 28, 2020 at 10:48 AM Xintong Song <tonysong...@gmail.com> wrote:
>>>>>
>>>>>>> A quick question, is network memory treated as managed memory now? Or in the future?
>>>>>>
>>>>>> No, network memory is independent of managed memory ATM. And I'm not aware of any plan to combine these two.
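>>>>>> As an illustration, the two are sized by separate options in flink-conf.yaml today (the values shown are the 1.11 defaults):
>>>>>>
>>>>>>     taskmanager.memory.managed.fraction: 0.4
>>>>>>     taskmanager.memory.network.fraction: 0.1
>>>>>>     taskmanager.memory.network.min: 64mb
>>>>>>     taskmanager.memory.network.max: 1gb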
>>>>>> Any insights there?
>>>>>>
>>>>>> Thank you~
>>>>>>
>>>>>> Xintong Song
>>>>>>
>>>>>> On Fri, Aug 28, 2020 at 4:35 PM Kurt Young <ykt...@gmail.com> wrote:
>>>>>>
>>>>>>> A quick question, is network memory treated as managed memory now? Or in the future?
>>>>>>>
>>>>>>> Best,
>>>>>>> Kurt
>>>>>>>
>>>>>>> On Wed, Aug 26, 2020 at 5:32 PM Xintong Song <tonysong...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi devs,
>>>>>>>>
>>>>>>>> I'd like to bring up the discussion over FLIP-141[1], which proposes how managed memory should be shared by various use cases within a slot. This is an extension to FLIP-53[2], where we assumed that the RocksDB state backend and batch operators are the only use cases of managed memory for streaming and batch jobs respectively, which is no longer true with the introduction of Python UDFs.
>>>>>>>>
>>>>>>>> Please note that we have not reached consensus between the two different designs. The major part of this FLIP describes one of the candidates, while the alternative is discussed in the section "Rejected Alternatives". We are hoping to borrow intelligence from the community to help us resolve the disagreement.
>>>>>>>>
>>>>>>>> Any feedback would be appreciated.
>>>>>>>>
>>>>>>>> Thank you~
>>>>>>>>
>>>>>>>> Xintong Song
>>>>>>>>
>>>>>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-141%3A+Intra-Slot+Managed+Memory+Sharing#FLIP141:IntraSlotManagedMemorySharing-compatibility
>>>>>>>>
>>>>>>>> [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Operator+Resource+Management