Thanks all for the feedback and discussion. I have updated the FLIP with the following changes.

- Choose the main proposal over the alternative approach.
- Combine weights of RocksDB and batch operators.
- Expose weights through configuration options, rather than via ExecutionConfig (sketched below).
- Add implementation plan.
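For illustration, the configuration-option approach might look roughly like this in flink-conf.yaml (the option key, consumer names, and default weights below are an illustrative sketch, not the final design; please see the FLIP for the actual proposal):

    # Illustrative sketch only: a single map-style option assigning relative
    # weights to managed memory consumers within a slot. DATAPROC here stands
    # for the combined RocksDB state backend / batch operator weight.
    taskmanager.memory.managed.consumer-weights: DATAPROC:70,PYTHON:30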
Please help take another look.

Thank you~

Xintong Song

On Wed, Sep 2, 2020 at 2:41 PM Xintong Song <tonysong...@gmail.com> wrote:

> Thanks for the inputs, Aljoscha & Till.
>
> # Weight Configuration
>
> I think exposing the knobs incrementally is a good idea. However, I'm not sure about non-configurable as the first step.
>
> Currently, users can tune memory for rocksdb ('taskmanager.memory.managed.size') and python ('python.fn-execution.[framework|buffer].memory.size') separately, which practically means any combination of rocksdb and python memory sizes. If we switch to non-configurable weights, that will be a regression compared to 1.11.
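> For reference, in Flink 1.11 the two can be sized independently in flink-conf.yaml, e.g. (the sizes below are arbitrary examples, not recommendations):
>
>     taskmanager.memory.managed.size: 512m
>     python.fn-execution.framework.memory.size: 64m
>     python.fn-execution.buffer.memory.size: 15m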
> Therefore, I think exposing via configuration options might be a good first step. And we can discuss exposing via ExecutionConfig if we later see that requirement.
>
> # Naming of Weights
>
> I'm neutral on "Flink/Internal memory".
>
> I think the reason we can combine weights for batch algorithms and state backends is that they are never mixed together. My only concern with "Flink/Internal memory", which might not be a problem at the moment, is what happens if new memory use cases appear in the future that can also be described as "Flink/Internal memory" but are not guaranteed never to be mixed with batch algorithms or state backends.
>
> Anyway, I think the naming should not block this FLIP, as long as we have consensus on combining the two weights for rocksdb and batch algorithms. We can keep the naming discussion open until the implementation phase.
>
> Thank you~
>
> Xintong Song
>
> On Tue, Sep 1, 2020 at 10:19 PM Till Rohrmann <trohrm...@apache.org> wrote:
>
>> Thanks for creating this FLIP Xintong.
>>
>> I agree with the previous comments that the memory configuration should be as easy as possible. Every new knob has the potential to confuse users and/or allow them to shoot themselves in the foot. Consequently, I am +1 for the first proposal in the FLIP since it is simpler.
>>
>> Also +1 for Stephan's proposal to combine batch operators' and RocksDB's memory usage into one weight.
>>
>> Concerning the names for the two weights, I fear that we are facing one of the two hard things in computer science. To add another idea, we could name them "Flink memory"/"Internal memory" and "Python memory".
>>
>> For the sake of making the scope of the FLIP as small as possible and to develop the feature incrementally, I think that Aljoscha's proposal to make it non-configurable as the first step sounds like a good idea. As a next step (and also if we see the need), we can make the memory weights configurable via the configuration. And last, we could expose it via the ExecutionConfig if it is required.
>>
>> Cheers,
>> Till
>>
>> On Tue, Sep 1, 2020 at 2:24 PM Aljoscha Krettek <aljos...@apache.org> wrote:
>>
>>> Hi,
>>>
>>> playing devil's advocate here: should we even make the memory weights configurable? We could go with weights that should make sense for most cases in the first version and only introduce configurable weights when (if) users need them.
>>>
>>> Regarding where/how things are configured, I think that most things should be a ConfigOption first (thanks for cc'ing me, Stephan!). This makes them configurable via flink-conf.yaml and via command line parameters, for example "bin/flink run -D memory.foo=bla ...".
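>>> For instance, a weight option could then be overridden for a single job submission (the option key here is purely illustrative):
>>>
>>>     bin/flink run -D taskmanager.memory.managed.consumer-weights=DATAPROC:60,PYTHON:40 ...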
>>> We can think about offering a programmatic API for cases where it makes sense, of course.
>>>
>>> Regarding naming one of the configurable weights "StateBackend-BatchAlgorithm": I think it's not a good idea to be that specific because the option will not age well, for example when we want to change which group of memory consumers are configured together or when we add something new.
>>>
>>> Best,
>>> Aljoscha
>>>
>>> On 31.08.20 08:13, Xintong Song wrote:
>>>> Thanks for the feedback, @Stephan
>>>>
>>>>> - There is a push to make as much as possible configurable via the main configuration, and not only in code. Specifically values for operations and tuning. I think it would be more important to have such memory weights in the config, compared to in the program API. /cc Aljoscha
>>>>
>>>> I can see the benefit that having memory weights in the main configuration makes tuning easier, which makes great sense to me. On the other hand, what we lose is the flexibility to have different weights for jobs running in the same Flink cluster. It seems to me the problem is that we don't have an easy way to override job-specific configurations without touching the code.
>>>>
>>>> Given the current status, what if we make the memory weights configurable through both the main configuration and the programming API? The main configuration should take effect iff the weights are not explicitly specified through the programming API. In this way, job cluster users can easily tune the weights through the main configuration, while session cluster users, if they want different weights for their jobs, can still override the weights through execution configs.
>>>>
>>>>> - My recommendation would be to keep this as simple as possible. Otherwise this will make a lot of configuration code harder, and make it harder for users to understand Flink's memory model. Making things as easy for users to understand is very important in my opinion. In that regard, the main proposal in the FLIP seems better than the alternative proposal listed at the end of the FLIP page.
>>>>
>>>> +1 from my side.
>>>>
>>>>> - For simplicity, we could go even further and simply have two memory users at the moment: the operator algorithm/data-structure and the external language process (Python for now). We never have batch algos and RocksDB mixed; having these as separate options is confusing, as it suggests they can be combined arbitrarily. I also think that the slim possibility that we may ever combine them in the future is not enough reason to make this more complex/confusing.
>>>>
>>>> Good point. +1 for combining the batch/rocksdb weights, for they're never mixed together. We can even just name it "StateBackend-BatchAlgorithm" to be explicit.
>>>>
>>>> For "external language process", I'm not entirely sure. Future external languages are possibly mixed with python processes. To avoid later having to consider how to share external language memory across different languages, I would suggest presenting the concept as "python memory" rather than "external language process memory".
>>>>
>>>> Thank you~
>>>>
>>>> Xintong Song
>>>>
>>>> On Sun, Aug 30, 2020 at 10:19 PM Stephan Ewen <se...@apache.org> wrote:
>>>>
>>>>> Thanks for driving this proposal. A few thoughts on the current design:
>>>>>
>>>>> - There is a push to make as much as possible configurable via the main configuration, and not only in code. Specifically values for operations and tuning. I think it would be more important to have such memory weights in the config, compared to in the program API. /cc Aljoscha
>>>>>
>>>>> - My recommendation would be to keep this as simple as possible. Otherwise this will make a lot of configuration code harder, and make it harder for users to understand Flink's memory model. Making things as easy for users to understand is very important in my opinion. In that regard, the main proposal in the FLIP seems better than the alternative proposal listed at the end of the FLIP page.
>>>>>
>>>>> - For simplicity, we could go even further and simply have two memory users at the moment: the operator algorithm/data-structure and the external language process (Python for now). We never have batch algos and RocksDB mixed; having these as separate options is confusing, as it suggests they can be combined arbitrarily. I also think that the slim possibility that we may ever combine them in the future is not enough reason to make this more complex/confusing.
>>>>>
>>>>> - I am also not aware of any plans to combine the network and operator memory. Not that it would be infeasible to do this, but I think this would also be orthogonal to this change, and I am not sure it would be solved with static weights. So trying to get network memory into this proposal seems premature to me.
>>>>>
>>>>> Best,
>>>>> Stephan
>>>>>
>>>>> On Fri, Aug 28, 2020 at 10:48 AM Xintong Song <tonysong...@gmail.com> wrote:
>>>>>
>>>>>>> A quick question, is network memory treated as managed memory now? Or in the future?
>>>>>>
>>>>>> No, network memory is independent of managed memory ATM. And I'm not aware of any plan to combine these two.
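>>>>>> As an illustration, the two are sized by separate options in flink-conf.yaml today (the values shown are the 1.11 defaults):
>>>>>>
>>>>>>     taskmanager.memory.managed.fraction: 0.4
>>>>>>     taskmanager.memory.network.fraction: 0.1
>>>>>>     taskmanager.memory.network.min: 64mb
>>>>>>     taskmanager.memory.network.max: 1gb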
>>>>>> Any insights there?
>>>>>>
>>>>>> Thank you~
>>>>>>
>>>>>> Xintong Song
>>>>>>
>>>>>> On Fri, Aug 28, 2020 at 4:35 PM Kurt Young <ykt...@gmail.com> wrote:
>>>>>>
>>>>>>> A quick question, is network memory treated as managed memory now? Or in the future?
>>>>>>>
>>>>>>> Best,
>>>>>>> Kurt
>>>>>>>
>>>>>>> On Wed, Aug 26, 2020 at 5:32 PM Xintong Song <tonysong...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi devs,
>>>>>>>>
>>>>>>>> I'd like to bring up the discussion over FLIP-141[1], which proposes how managed memory should be shared by various use cases within a slot. This is an extension to FLIP-53[2], where we assumed that the RocksDB state backend and batch operators are the only use cases of managed memory for streaming and batch jobs respectively, which is no longer true with the introduction of Python UDFs.
>>>>>>>>
>>>>>>>> Please note that we have not reached consensus between the two different designs. The major part of this FLIP describes one of the candidates, while the alternative is discussed in the section "Rejected Alternatives". We are hoping to borrow intelligence from the community to help us resolve the disagreement.
>>>>>>>>
>>>>>>>> Any feedback would be appreciated.
>>>>>>>>
>>>>>>>> Thank you~
>>>>>>>>
>>>>>>>> Xintong Song
>>>>>>>>
>>>>>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-141%3A+Intra-Slot+Managed+Memory+Sharing#FLIP141:IntraSlotManagedMemorySharing-compatibility
>>>>>>>>
>>>>>>>> [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Operator+Resource+Management