Thanks all for the votes. So far, we have - 4 binding +1 votes (Stephan, Andrey, Till, Zhijiang) - 2 un-binding +1 votes (Xintong, Yu) - No -1 votes
There are more than 3 binding +1 votes and no -1 votes, and the voting time has past. According to the new bylaws, I'm happily to announce that FLIP-49 is approved to be adopted by Apache Flink. Regarding the minors mentioned during the voting, if there's no objection, I would like to update the FLIP document with the followings - Exclude JVM Overhead from '-XX:MaxDirectMemorySize' - Add a 'Follow-Up' section, with the follow-ups of web ui and documentation issues - Add a 'Limitation' section, with the Unsafe Java 12 support issue Thank you~ Xintong Song On Fri, Sep 6, 2019 at 10:28 AM Xintong Song <tonysong...@gmail.com> wrote: > +1 (non-binding) from my side. > > @Yu, thanks for the vote. > - The current FLIP document already mentioned the issue that Unsafe is not > supported in Java 12, in the section 'Unifying Explicit and Implicit Memory > Allocation'. It makes sense to me to emphasize this by adding a separate > limitation section. > - I think we should also update the FLIP document if we change the config > names later in PRs. But I would not consider this as a major change to the > FLIP that requires another vote, especially when we already agreed during > this vote to revisit the config names at implementation stage. > > Thank you~ > > Xintong Song > > > > On Fri, Sep 6, 2019 at 2:43 AM Yu Li <car...@gmail.com> wrote: > >> +1 (non-binding) >> >> Minor: >> 1. Is it worth a separate "Limitations" section to contain all relative >> information like the Unsafe support issue in Java 12, just like many other >> FLIPs? >> 2. About the config names, if we change them later in PR, does it mean we >> will need to update the FLIP document? If so, it seems we need another >> vote >> after the modification according to current bylaw? Or maybe we could just >> create a subpage under the FLIP and only re-vote on that part later? >> >> Thanks. >> >> Best Regards, >> Yu >> >> >> On Thu, 5 Sep 2019 at 00:52, Stephan Ewen <se...@apache.org> wrote: >> >> > Let's not block on config key names, just go ahead and we figure this >> out >> > concurrently or on the PR later. >> > >> > >> > On Wed, Sep 4, 2019 at 3:48 PM Stephan Ewen <se...@apache.org> wrote: >> > >> > > Maybe to clear up confusion about my suggestion: >> > > >> > > I would vote to keep the name of the config parameter >> > > "taskmanager.memory.network" because it is the same key as currently >> > (good >> > > to not break things unless good reason) and there currently is no >> case or >> > > even a concrete follow-up where we would actually differentiate >> between >> > > different types of network memory. >> > > >> > > I would suggest to not prematurely rename this because of something >> that >> > > might happen in the future. Experience shows that its better to do >> these >> > > things when the actual change comes. >> > > >> > > Side note: I am not sure "shuffle" is a good term in this context. I >> have >> > > so far only heard that in batch contexts, which is not what we do >> here. >> > One >> > > more reason for me to not pre-maturely change names. >> > > >> > > On Wed, Sep 4, 2019 at 10:56 AM Xintong Song <tonysong...@gmail.com> >> > > wrote: >> > > >> > >> @till >> > >> >> > >> > Just to clarify Xintong, you suggest that Task off-heap memory >> > >> represents >> > >> > direct and native memory. Since we don't know how the user will >> > allocate >> > >> > the memory we will add this value to -XX:MaxDirectMemorySize so >> that >> > the >> > >> > process won't fail if the user allocates only direct memory and no >> > >> native >> > >> > memory. Is that correct? >> > >> > >> > >> Yes, this is what I mean. >> > >> >> > >> >> > >> Thank you~ >> > >> >> > >> Xintong Song >> > >> >> > >> >> > >> >> > >> On Wed, Sep 4, 2019 at 4:25 PM Till Rohrmann <trohrm...@apache.org> >> > >> wrote: >> > >> >> > >> > Just to clarify Xintong, you suggest that Task off-heap memory >> > >> represents >> > >> > direct and native memory. Since we don't know how the user will >> > allocate >> > >> > the memory we will add this value to -XX:MaxDirectMemorySize so >> that >> > the >> > >> > process won't fail if the user allocates only direct memory and no >> > >> native >> > >> > memory. Is that correct? >> > >> > >> > >> > Cheers, >> > >> > Till >> > >> > >> > >> > On Wed, Sep 4, 2019 at 10:18 AM Xintong Song < >> tonysong...@gmail.com> >> > >> > wrote: >> > >> > >> > >> > > @Stephan >> > >> > > Not sure what do you mean by "just having this value". Are you >> > >> suggesting >> > >> > > that having "taskmanager.memory.network" refers to "shuffle" and >> > >> "other >> > >> > > network memory", or only "shuffle"? >> > >> > > >> > >> > > I guess what you mean is only "shuffle"? Because currently >> > >> > > "taskmanager.network.memory" refers to shuffle buffers only, >> which >> > is >> > >> > "one >> > >> > > less config value to break". >> > >> > > >> > >> > > Thank you~ >> > >> > > >> > >> > > Xintong Song >> > >> > > >> > >> > > >> > >> > > >> > >> > > On Wed, Sep 4, 2019 at 3:42 PM Stephan Ewen <se...@apache.org> >> > wrote: >> > >> > > >> > >> > > > If we later split the network memory into "shuffle" and "other >> > >> network >> > >> > > > memory", I think it would make sense to split the option then. >> > >> > > > >> > >> > > > In that case "taskmanager.memory.network" would probably refer >> to >> > >> the >> > >> > > total >> > >> > > > network memory, which would most likely be what most users >> > actually >> > >> > > > configure. >> > >> > > > My feeling is that for now just having this value is actually >> > >> easier, >> > >> > and >> > >> > > > it is one less config value to break (which is also good). >> > >> > > > >> > >> > > > On Wed, Sep 4, 2019 at 9:05 AM Xintong Song < >> > tonysong...@gmail.com> >> > >> > > wrote: >> > >> > > > >> > >> > > > > Thanks for the voting and comments. >> > >> > > > > >> > >> > > > > @Stephan >> > >> > > > > - The '-XX:MaxDirectMemorySize' value should not include JVM >> > >> > Overhead. >> > >> > > > > Thanks for correction. >> > >> > > > > - 'taskmanager.memory.framework.heap' it heap memory reserved >> > for >> > >> > task >> > >> > > > > executor framework, which can not be allocated to task >> slots. I >> > >> think >> > >> > > > users >> > >> > > > > should be able to configure both how many total java heap >> > memory a >> > >> > task >> > >> > > > > executor should have and how many of the total java heap >> memory >> > >> can >> > >> > be >> > >> > > > > allocated to task slots. >> > >> > > > > - Regarding network / shuffle memory, I'm with @Andrey. ATM, >> > this >> > >> > > memory >> > >> > > > > pool (derived from >> > >> "taskmanager.network.memory.[min/max/fraction]") >> > >> > is >> > >> > > > only >> > >> > > > > used inside NettyShuffleEnvironment as network buffers. There >> > >> might >> > >> > be >> > >> > > > > other network memory usage outside the shuffle component (as >> > >> > @Zhijiang >> > >> > > > also >> > >> > > > > suggested), but that is not accounted by this memory pool. >> This >> > is >> > >> > > > exactly >> > >> > > > > why I would suggest to change the name from 'network' to >> > >> 'shuffle'. >> > >> > > > > - I agree that we need very good documentation to explain the >> > >> memory >> > >> > > > pools >> > >> > > > > and config options, as well as WebUI to present the memory >> pool >> > >> > sizes. >> > >> > > I >> > >> > > > > would suggest to address these as follow-ups of all the three >> > >> > resource >> > >> > > > > management FLIPs, for better integration. >> > >> > > > > >> > >> > > > > @Andrey >> > >> > > > > - Agree with the 'shuffle' naming. See above. >> > >> > > > > >> > >> > > > > @Till >> > >> > > > > - My understanding is that Task Off-heap memory accounts for >> > both >> > >> > > direct >> > >> > > > > and native memory used by the user code. I'm not sure >> whether we >> > >> need >> > >> > > > > another configure option to split it. Given that we only >> decided >> > >> (in >> > >> > > the >> > >> > > > > discussion thread) to try it out the way we set >> > >> > > '-XX:MaxDirectMemorySize' >> > >> > > > > in current design and may switch to other alternatives if it >> > >> doesn't >> > >> > > work >> > >> > > > > out well, I would suggest the same for this one. I suggest >> that >> > we >> > >> > > first >> > >> > > > > try it without the splitting config option, and we can easily >> > add >> > >> it >> > >> > if >> > >> > > > it >> > >> > > > > doesn't work well. >> > >> > > > > - Agree that it's really important to have good documentation >> > for >> > >> > this. >> > >> > > > See >> > >> > > > > above. >> > >> > > > > >> > >> > > > > @Zhijiang >> > >> > > > > - Thanks for the input. My understanding is that 'shuffle >> > memory' >> > >> is >> > >> > a >> > >> > > > > portion of the task executor memory reserved for the shuffle >> > >> > component. >> > >> > > > The >> > >> > > > > way shuffle component use these memory (local buffer pool, >> netty >> > >> > > internal >> > >> > > > > memory, etc.) can be different depending on the shuffle >> > >> > implementation. >> > >> > > > The >> > >> > > > > task executor (outside of the shuffle implementation) should >> > only >> > >> > know >> > >> > > > the >> > >> > > > > overall memory usage of the shuffle component but no need to >> > >> > understand >> > >> > > > > more details inside the shuffle implementation. >> > >> > > > > >> > >> > > > > Thank you~ >> > >> > > > > >> > >> > > > > Xintong Song >> > >> > > > > >> > >> > > > > >> > >> > > > > >> > >> > > > > On Tue, Sep 3, 2019 at 10:41 PM zhijiang < >> > >> wangzhijiang...@aliyun.com >> > >> > > > > .invalid> >> > >> > > > > wrote: >> > >> > > > > >> > >> > > > > > Thanks for proposing this FLIP and also +1 on my side. >> > >> > > > > > >> > >> > > > > > @Andrey Zagrebin For the point of "network memory is >> actually >> > >> used >> > >> > > more >> > >> > > > > > than shuffling", I guess that the component of queryable >> state >> > >> is >> > >> > > also >> > >> > > > > > using network/netty stack atm, which is outside of >> shuffling. >> > >> > > > > > In addition, if we only consider the shuffle memory >> provided >> > by >> > >> > > shuffle >> > >> > > > > > service interface, we should not only consider the memory >> used >> > >> by >> > >> > > local >> > >> > > > > > buffer pool, but also consider the netty internal memory >> > >> > > > > > usages as the overhead, especially we have not the >> zero-copy >> > >> > > > improvement >> > >> > > > > > on dowstream read side. This issue might be out of the vote >> > >> scope, >> > >> > > just >> > >> > > > > > think of we have this issue in [1]. :) >> > >> > > > > > >> > >> > > > > > [1] https://issues.apache.org/jira/browse/FLINK-12110 >> > >> > > > > > >> > >> > > > > > Best, >> > >> > > > > > Zhijiang >> > >> > > > > > >> > >> ------------------------------------------------------------------ >> > >> > > > > > From:Till Rohrmann <trohrm...@apache.org> >> > >> > > > > > Send Time:2019年9月3日(星期二) 15:07 >> > >> > > > > > To:dev <dev@flink.apache.org> >> > >> > > > > > Subject:Re: [VOTE] FLIP-49: Unified Memory Configuration >> for >> > >> > > > > TaskExecutors >> > >> > > > > > >> > >> > > > > > Thanks for creating this FLIP and starting the vote >> Xintong. >> > >> > > > > > >> > >> > > > > > +1 for the proposal from my side. >> > >> > > > > > >> > >> > > > > > I agree with Stephan that we might wanna revisit some of >> the >> > >> > > > > configuration >> > >> > > > > > names. >> > >> > > > > > >> > >> > > > > > If I understood it correctly, then Task Off-heap memory >> > >> represents >> > >> > > the >> > >> > > > > > direct memory used by the user code, right? How would users >> > >> > configure >> > >> > > > > > native memory requirements for the user code? If it is >> part of >> > >> Task >> > >> > > Off >> > >> > > > > > heap memory, then we need to split it to set >> > >> > -XX:MaxDirectMemorySize >> > >> > > > > > correctly or to introduce another configuration option. >> > >> > > > > > >> > >> > > > > > Given all these configuration options, I can see that users >> > will >> > >> > get >> > >> > > > > > confused quite easily. Therefore, I would like to emphasise >> > >> that we >> > >> > > > need >> > >> > > > > a >> > >> > > > > > very good documentation about how to properly configure >> Flink >> > >> > > processes >> > >> > > > > and >> > >> > > > > > which knobs to turn in which cases. >> > >> > > > > > >> > >> > > > > > Cheers, >> > >> > > > > > Till >> > >> > > > > > >> > >> > > > > > On Tue, Sep 3, 2019 at 2:34 PM Andrey Zagrebin < >> > >> > and...@ververica.com >> > >> > > > >> > >> > > > > > wrote: >> > >> > > > > > >> > >> > > > > > > Thanks for starting the vote Xintong >> > >> > > > > > > >> > >> > > > > > > Also +1 for the proposed FLIP-49. >> > >> > > > > > > >> > >> > > > > > > @Stephan regarding namings: network vs shuffle. >> > >> > > > > > > My understanding so far was that the network memory is >> what >> > we >> > >> > > > > basically >> > >> > > > > > > give to Shuffle implementations and default netty >> > >> implementation >> > >> > > uses >> > >> > > > > it >> > >> > > > > > in >> > >> > > > > > > particular mostly for networking. >> > >> > > > > > > Are the network pools used for something else outside of >> the >> > >> > > > shuffling >> > >> > > > > > > scope? >> > >> > > > > > > >> > >> > > > > > > best, >> > >> > > > > > > Andrey >> > >> > > > > > > >> > >> > > > > > > On Tue, Sep 3, 2019 at 1:01 PM Stephan Ewen < >> > se...@apache.org >> > >> > >> > >> > > > wrote: >> > >> > > > > > > >> > >> > > > > > > > +1 to the proposal in general >> > >> > > > > > > > >> > >> > > > > > > > A few things seems to be a bit put of sync with the >> latest >> > >> > > > > discussions >> > >> > > > > > > > though. >> > >> > > > > > > > >> > >> > > > > > > > The section about JVM Parameters states that the >> > >> > > > > > > > -XX:MaxDirectMemorySize value is set to Task Off-heap >> > >> Memory, >> > >> > > > Shuffle >> > >> > > > > > > > Memory and JVM Overhead. >> > >> > > > > > > > The way I understand the last discussion conclusion is >> > that >> > >> it >> > >> > is >> > >> > > > > only >> > >> > > > > > > the >> > >> > > > > > > > sum of shuffle memory and user-defined direct memory. >> > >> > > > > > > > >> > >> > > > > > > > I am someone neutral but unsure about is the separation >> > >> between >> > >> > > > > > > > "taskmanager.memory.framework.heap" and >> > >> > > > > "taskmanager.memory.task.heap". >> > >> > > > > > > > Could that be simply combined under >> > >> > > "taskmanager.memory.javaheap"? >> > >> > > > > > > > >> > >> > > > > > > > It might be good to also expose these values somehow in >> > the >> > >> web >> > >> > > UI >> > >> > > > so >> > >> > > > > > > that >> > >> > > > > > > > users see immediately what amount of memory TMs assume >> to >> > >> use >> > >> > for >> > >> > > > > what. >> > >> > > > > > > > >> > >> > > > > > > > I assume config key names and default values might be >> > >> adjusted >> > >> > > over >> > >> > > > > > time >> > >> > > > > > > as >> > >> > > > > > > > we get feedback. >> > >> > > > > > > > - I would keep the network memory under the name >> > >> > > > > > > > "taskmanager.memory.network". Because network memory is >> > >> > actually >> > >> > > > used >> > >> > > > > > for >> > >> > > > > > > > more than shuffling. Also, the old config key seems >> good, >> > so >> > >> > why >> > >> > > > > change >> > >> > > > > > > it? >> > >> > > > > > > > >> > >> > > > > > > > One thing to be aware of is that often, the Java Heap >> is >> > >> > > understood >> > >> > > > > as >> > >> > > > > > > > "managed memory" as a whole, because it is managed by >> the >> > GC >> > >> > not >> > >> > > > > > > explicitly >> > >> > > > > > > > by the user. >> > >> > > > > > > > So we need to make sure that we don't confuse users by >> > >> speaking >> > >> > > of >> > >> > > > > > > managed >> > >> > > > > > > > heap and unmanaged heap. All heap is managed in Java. >> Some >> > >> > memory >> > >> > > > is >> > >> > > > > > > > explicitly managed by Flink. >> > >> > > > > > > > >> > >> > > > > > > > Best, >> > >> > > > > > > > Stephan >> > >> > > > > > > > >> > >> > > > > > > > >> > >> > > > > > > > On Mon, Sep 2, 2019 at 3:08 PM Xintong Song < >> > >> > > tonysong...@gmail.com >> > >> > > > > >> > >> > > > > > > wrote: >> > >> > > > > > > > >> > >> > > > > > > > > Hi everyone, >> > >> > > > > > > > > >> > >> > > > > > > > > I'm here to re-start the voting process for FLIP-49 >> [1], >> > >> with >> > >> > > > > respect >> > >> > > > > > > to >> > >> > > > > > > > > consensus reached in this thread [2] regarding some >> new >> > >> > > comments >> > >> > > > > and >> > >> > > > > > > > > concerns. >> > >> > > > > > > > > >> > >> > > > > > > > > This voting will be open for at least 72 hours. I'll >> try >> > >> to >> > >> > > close >> > >> > > > > it >> > >> > > > > > > Sep. >> > >> > > > > > > > > 5, 14:00 UTC, unless there is an objection or not >> enough >> > >> > votes. >> > >> > > > > > > > > >> > >> > > > > > > > > Thank you~ >> > >> > > > > > > > > >> > >> > > > > > > > > Xintong Song >> > >> > > > > > > > > >> > >> > > > > > > > > >> > >> > > > > > > > > [1] >> > >> > > > > > > > > >> > >> > > > > > > > > >> > >> > > > > > > > >> > >> > > > > > > >> > >> > > > > > >> > >> > > > > >> > >> > > > >> > >> > > >> > >> > >> > >> >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors >> > >> > > > > > > > > [2] >> > >> > > > > > > > > >> > >> > > > > > > > > >> > >> > > > > > > > >> > >> > > > > > > >> > >> > > > > > >> > >> > > > > >> > >> > > > >> > >> > > >> > >> > >> > >> >> > >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-49-Unified-Memory-Configuration-for-TaskExecutors-td31436.html >> > >> > > > > > > > > >> > >> > > > > > > > > On Tue, Aug 27, 2019 at 9:29 PM Xintong Song < >> > >> > > > > tonysong...@gmail.com> >> > >> > > > > > > > > wrote: >> > >> > > > > > > > > >> > >> > > > > > > > > > Alright, then let's keep the discussion in the >> DISCUSS >> > >> > > mailing >> > >> > > > > > > thread, >> > >> > > > > > > > > and >> > >> > > > > > > > > > see whether we need to restart the vote. >> > >> > > > > > > > > > >> > >> > > > > > > > > > Thank you~ >> > >> > > > > > > > > > >> > >> > > > > > > > > > Xintong Song >> > >> > > > > > > > > > >> > >> > > > > > > > > > >> > >> > > > > > > > > > >> > >> > > > > > > > > > On Tue, Aug 27, 2019 at 8:12 PM Till Rohrmann < >> > >> > > > > > trohrm...@apache.org> >> > >> > > > > > > > > > wrote: >> > >> > > > > > > > > > >> > >> > > > > > > > > >> I had a couple of comments concerning the >> > >> implementation >> > >> > > plan. >> > >> > > > > > I've >> > >> > > > > > > > > posted >> > >> > > > > > > > > >> them to the original discussion thread. Depending >> on >> > >> the >> > >> > > > outcome >> > >> > > > > > of >> > >> > > > > > > > this >> > >> > > > > > > > > >> discussion we might need to restart the vote. >> > >> > > > > > > > > >> >> > >> > > > > > > > > >> Cheers, >> > >> > > > > > > > > >> Till >> > >> > > > > > > > > >> >> > >> > > > > > > > > >> On Tue, Aug 27, 2019 at 11:14 AM Xintong Song < >> > >> > > > > > > tonysong...@gmail.com> >> > >> > > > > > > > > >> wrote: >> > >> > > > > > > > > >> >> > >> > > > > > > > > >> > Hi all, >> > >> > > > > > > > > >> > >> > >> > > > > > > > > >> > I would like to start the voting process for >> > FLIP-49 >> > >> > [1], >> > >> > > > > which >> > >> > > > > > is >> > >> > > > > > > > > >> > discussed and reached consensus in this thread >> [2]. >> > >> > > > > > > > > >> > >> > >> > > > > > > > > >> > This voting will be open for at least 72 hours. >> > I'll >> > >> try >> > >> > > to >> > >> > > > > > close >> > >> > > > > > > it >> > >> > > > > > > > > >> Aug. >> > >> > > > > > > > > >> > 30 10:00 UTC, unless there is an objection or >> not >> > >> enough >> > >> > > > > votes. >> > >> > > > > > > > > >> > >> > >> > > > > > > > > >> > Thank you~ >> > >> > > > > > > > > >> > >> > >> > > > > > > > > >> > Xintong Song >> > >> > > > > > > > > >> > >> > >> > > > > > > > > >> > >> > >> > > > > > > > > >> > [1] >> > >> > > > > > > > > >> > >> > >> > > > > > > > > >> > >> > >> > > > > > > > > >> >> > >> > > > > > > > > >> > >> > > > > > > > >> > >> > > > > > > >> > >> > > > > > >> > >> > > > > >> > >> > > > >> > >> > > >> > >> > >> > >> >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors >> > >> > > > > > > > > >> > [2] >> > >> > > > > > > > > >> > >> > >> > > > > > > > > >> > >> > >> > > > > > > > > >> >> > >> > > > > > > > > >> > >> > > > > > > > >> > >> > > > > > > >> > >> > > > > > >> > >> > > > > >> > >> > > > >> > >> > > >> > >> > >> > >> >> > >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-49-Unified-Memory-Configuration-for-TaskExecutors-td31436.html >> > >> > > > > > > > > >> > >> > >> > > > > > > > > >> >> > >> > > > > > > > > > >> > >> > > > > > > > > >> > >> > > > > > > > >> > >> > > > > > > >> > >> > > > > > >> > >> > > > > > >> > >> > > > > >> > >> > > > >> > >> > > >> > >> > >> > >> >> > > >> > >> >