Thanks all for the votes.
So far, we have

   - 4 binding +1 votes (Stephan, Andrey, Till, Zhijiang)
   - 2 un-binding +1 votes (Xintong, Yu)
   - No -1 votes

There are more than 3 binding +1 votes and no -1 votes, and the voting time
has past. According to the new bylaws, I'm happily to announce that FLIP-49
is approved to be adopted by Apache Flink.

Regarding the minors mentioned during the voting, if there's no objection,
I would like to update the FLIP document with the followings

   - Exclude JVM Overhead from '-XX:MaxDirectMemorySize'
   - Add a 'Follow-Up' section, with the follow-ups of web ui and
   documentation issues
   - Add a 'Limitation' section, with the Unsafe Java 12 support issue


Thank you~

Xintong Song



On Fri, Sep 6, 2019 at 10:28 AM Xintong Song <tonysong...@gmail.com> wrote:

> +1 (non-binding) from my side.
>
> @Yu, thanks for the vote.
> - The current FLIP document already mentioned the issue that Unsafe is not
> supported in Java 12, in the section 'Unifying Explicit and Implicit Memory
> Allocation'. It makes sense to me to emphasize this by adding a separate
> limitation section.
> - I think we should also update the FLIP document if we change the config
> names later in PRs. But I would not consider this as a major change to the
> FLIP that requires another vote, especially when we already agreed during
> this vote to revisit the config names at implementation stage.
>
> Thank you~
>
> Xintong Song
>
>
>
> On Fri, Sep 6, 2019 at 2:43 AM Yu Li <car...@gmail.com> wrote:
>
>> +1 (non-binding)
>>
>> Minor:
>> 1. Is it worth a separate "Limitations" section to contain all relative
>> information like the Unsafe support issue in Java 12, just like many other
>> FLIPs?
>> 2. About the config names, if we change them later in PR, does it mean we
>> will need to update the FLIP document? If so, it seems we need another
>> vote
>> after the modification according to current bylaw? Or maybe we could just
>> create a subpage under the FLIP and only re-vote on that part later?
>>
>> Thanks.
>>
>> Best Regards,
>> Yu
>>
>>
>> On Thu, 5 Sep 2019 at 00:52, Stephan Ewen <se...@apache.org> wrote:
>>
>> > Let's not block on config key names, just go ahead and we figure this
>> out
>> > concurrently or on the PR later.
>> >
>> >
>> > On Wed, Sep 4, 2019 at 3:48 PM Stephan Ewen <se...@apache.org> wrote:
>> >
>> > > Maybe to clear up confusion about my suggestion:
>> > >
>> > > I would vote to keep the name of the config parameter
>> > > "taskmanager.memory.network" because it is the same key as currently
>> > (good
>> > > to not break things unless good reason) and there currently is no
>> case or
>> > > even a concrete follow-up where we would actually differentiate
>> between
>> > > different types of network memory.
>> > >
>> > > I would suggest to not prematurely rename this because of something
>> that
>> > > might happen in the future. Experience shows that its better to do
>> these
>> > > things when the actual change comes.
>> > >
>> > > Side note: I am not sure "shuffle" is a good term in this context. I
>> have
>> > > so far only heard that in batch contexts, which is not what we do
>> here.
>> > One
>> > > more reason for me to not pre-maturely change names.
>> > >
>> > > On Wed, Sep 4, 2019 at 10:56 AM Xintong Song <tonysong...@gmail.com>
>> > > wrote:
>> > >
>> > >> @till
>> > >>
>> > >> > Just to clarify Xintong, you suggest that Task off-heap memory
>> > >> represents
>> > >> > direct and native memory. Since we don't know how the user will
>> > allocate
>> > >> > the memory we will add this value to -XX:MaxDirectMemorySize so
>> that
>> > the
>> > >> > process won't fail if the user allocates only direct memory and no
>> > >> native
>> > >> > memory. Is that correct?
>> > >> >
>> > >> Yes, this is what I mean.
>> > >>
>> > >>
>> > >> Thank you~
>> > >>
>> > >> Xintong Song
>> > >>
>> > >>
>> > >>
>> > >> On Wed, Sep 4, 2019 at 4:25 PM Till Rohrmann <trohrm...@apache.org>
>> > >> wrote:
>> > >>
>> > >> > Just to clarify Xintong, you suggest that Task off-heap memory
>> > >> represents
>> > >> > direct and native memory. Since we don't know how the user will
>> > allocate
>> > >> > the memory we will add this value to -XX:MaxDirectMemorySize so
>> that
>> > the
>> > >> > process won't fail if the user allocates only direct memory and no
>> > >> native
>> > >> > memory. Is that correct?
>> > >> >
>> > >> > Cheers,
>> > >> > Till
>> > >> >
>> > >> > On Wed, Sep 4, 2019 at 10:18 AM Xintong Song <
>> tonysong...@gmail.com>
>> > >> > wrote:
>> > >> >
>> > >> > > @Stephan
>> > >> > > Not sure what do you mean by "just having this value". Are you
>> > >> suggesting
>> > >> > > that having "taskmanager.memory.network" refers to "shuffle" and
>> > >> "other
>> > >> > > network memory", or only "shuffle"?
>> > >> > >
>> > >> > > I guess what you mean is only "shuffle"? Because currently
>> > >> > > "taskmanager.network.memory" refers to shuffle buffers only,
>> which
>> > is
>> > >> > "one
>> > >> > > less config value to break".
>> > >> > >
>> > >> > > Thank you~
>> > >> > >
>> > >> > > Xintong Song
>> > >> > >
>> > >> > >
>> > >> > >
>> > >> > > On Wed, Sep 4, 2019 at 3:42 PM Stephan Ewen <se...@apache.org>
>> > wrote:
>> > >> > >
>> > >> > > > If we later split the network memory into "shuffle" and "other
>> > >> network
>> > >> > > > memory", I think it would make sense to split the option then.
>> > >> > > >
>> > >> > > > In that case "taskmanager.memory.network" would probably refer
>> to
>> > >> the
>> > >> > > total
>> > >> > > > network memory, which would most likely be what most users
>> > actually
>> > >> > > > configure.
>> > >> > > > My feeling is that for now just having this value is actually
>> > >> easier,
>> > >> > and
>> > >> > > > it is one less config value to break (which is also good).
>> > >> > > >
>> > >> > > > On Wed, Sep 4, 2019 at 9:05 AM Xintong Song <
>> > tonysong...@gmail.com>
>> > >> > > wrote:
>> > >> > > >
>> > >> > > > > Thanks for the voting and comments.
>> > >> > > > >
>> > >> > > > > @Stephan
>> > >> > > > > - The '-XX:MaxDirectMemorySize' value should not include JVM
>> > >> > Overhead.
>> > >> > > > > Thanks for correction.
>> > >> > > > > - 'taskmanager.memory.framework.heap' it heap memory reserved
>> > for
>> > >> > task
>> > >> > > > > executor framework, which can not be allocated to task
>> slots. I
>> > >> think
>> > >> > > > users
>> > >> > > > > should be able to configure both how many total java heap
>> > memory a
>> > >> > task
>> > >> > > > > executor should have and how many of the total java heap
>> memory
>> > >> can
>> > >> > be
>> > >> > > > > allocated to task slots.
>> > >> > > > > - Regarding network / shuffle memory, I'm with @Andrey. ATM,
>> > this
>> > >> > > memory
>> > >> > > > > pool (derived from
>> > >> "taskmanager.network.memory.[min/max/fraction]")
>> > >> > is
>> > >> > > > only
>> > >> > > > > used inside NettyShuffleEnvironment as network buffers. There
>> > >> might
>> > >> > be
>> > >> > > > > other network memory usage outside the shuffle component (as
>> > >> > @Zhijiang
>> > >> > > > also
>> > >> > > > > suggested), but that is not accounted by this memory pool.
>> This
>> > is
>> > >> > > > exactly
>> > >> > > > > why I would suggest to change the name from 'network' to
>> > >> 'shuffle'.
>> > >> > > > > - I agree that we need very good documentation to explain the
>> > >> memory
>> > >> > > > pools
>> > >> > > > > and config options, as well as WebUI to present the memory
>> pool
>> > >> > sizes.
>> > >> > > I
>> > >> > > > > would suggest to address these as follow-ups of all the three
>> > >> > resource
>> > >> > > > > management FLIPs, for better integration.
>> > >> > > > >
>> > >> > > > > @Andrey
>> > >> > > > > - Agree with the 'shuffle' naming. See above.
>> > >> > > > >
>> > >> > > > > @Till
>> > >> > > > > - My understanding is that Task Off-heap memory accounts for
>> > both
>> > >> > > direct
>> > >> > > > > and native memory used by the user code. I'm not sure
>> whether we
>> > >> need
>> > >> > > > > another configure option to split it. Given that we only
>> decided
>> > >> (in
>> > >> > > the
>> > >> > > > > discussion thread) to try it out the way we set
>> > >> > > '-XX:MaxDirectMemorySize'
>> > >> > > > > in current design and may switch to other alternatives if it
>> > >> doesn't
>> > >> > > work
>> > >> > > > > out well, I would suggest the same for this one. I suggest
>> that
>> > we
>> > >> > > first
>> > >> > > > > try it without the splitting config option, and we can easily
>> > add
>> > >> it
>> > >> > if
>> > >> > > > it
>> > >> > > > > doesn't work well.
>> > >> > > > > - Agree that it's really important to have good documentation
>> > for
>> > >> > this.
>> > >> > > > See
>> > >> > > > > above.
>> > >> > > > >
>> > >> > > > > @Zhijiang
>> > >> > > > > - Thanks for the input. My understanding is that 'shuffle
>> > memory'
>> > >> is
>> > >> > a
>> > >> > > > > portion of the task executor memory reserved for the shuffle
>> > >> > component.
>> > >> > > > The
>> > >> > > > > way shuffle component use these memory (local buffer pool,
>> netty
>> > >> > > internal
>> > >> > > > > memory, etc.) can be different depending on the shuffle
>> > >> > implementation.
>> > >> > > > The
>> > >> > > > > task executor (outside of the shuffle implementation) should
>> > only
>> > >> > know
>> > >> > > > the
>> > >> > > > > overall memory usage of the shuffle component but no need to
>> > >> > understand
>> > >> > > > > more details inside the shuffle implementation.
>> > >> > > > >
>> > >> > > > > Thank you~
>> > >> > > > >
>> > >> > > > > Xintong Song
>> > >> > > > >
>> > >> > > > >
>> > >> > > > >
>> > >> > > > > On Tue, Sep 3, 2019 at 10:41 PM zhijiang <
>> > >> wangzhijiang...@aliyun.com
>> > >> > > > > .invalid>
>> > >> > > > > wrote:
>> > >> > > > >
>> > >> > > > > > Thanks for proposing this FLIP and also +1 on my side.
>> > >> > > > > >
>> > >> > > > > > @Andrey Zagrebin For the point of "network memory is
>> actually
>> > >> used
>> > >> > > more
>> > >> > > > > > than shuffling", I guess that the component of queryable
>> state
>> > >> is
>> > >> > > also
>> > >> > > > > > using network/netty stack atm, which is outside of
>> shuffling.
>> > >> > > > > > In addition, if we only consider the shuffle memory
>> provided
>> > by
>> > >> > > shuffle
>> > >> > > > > > service interface, we should not only consider the memory
>> used
>> > >> by
>> > >> > > local
>> > >> > > > > > buffer pool, but also consider the netty internal memory
>> > >> > > > > > usages as the overhead, especially we have not the
>> zero-copy
>> > >> > > > improvement
>> > >> > > > > > on dowstream read side. This issue might be out of the vote
>> > >> scope,
>> > >> > > just
>> > >> > > > > > think of we have this issue in [1]. :)
>> > >> > > > > >
>> > >> > > > > > [1] https://issues.apache.org/jira/browse/FLINK-12110
>> > >> > > > > >
>> > >> > > > > > Best,
>> > >> > > > > > Zhijiang
>> > >> > > > > >
>> > >> ------------------------------------------------------------------
>> > >> > > > > > From:Till Rohrmann <trohrm...@apache.org>
>> > >> > > > > > Send Time:2019年9月3日(星期二) 15:07
>> > >> > > > > > To:dev <dev@flink.apache.org>
>> > >> > > > > > Subject:Re: [VOTE] FLIP-49: Unified Memory Configuration
>> for
>> > >> > > > > TaskExecutors
>> > >> > > > > >
>> > >> > > > > > Thanks for creating this FLIP and starting the vote
>> Xintong.
>> > >> > > > > >
>> > >> > > > > > +1 for the proposal from my side.
>> > >> > > > > >
>> > >> > > > > > I agree with Stephan that we might wanna revisit some of
>> the
>> > >> > > > > configuration
>> > >> > > > > > names.
>> > >> > > > > >
>> > >> > > > > > If I understood it correctly, then Task Off-heap memory
>> > >> represents
>> > >> > > the
>> > >> > > > > > direct memory used by the user code, right? How would users
>> > >> > configure
>> > >> > > > > > native memory requirements for the user code? If it is
>> part of
>> > >> Task
>> > >> > > Off
>> > >> > > > > > heap memory, then we need to split it to set
>> > >> > -XX:MaxDirectMemorySize
>> > >> > > > > > correctly or to introduce another configuration option.
>> > >> > > > > >
>> > >> > > > > > Given all these configuration options, I can see that users
>> > will
>> > >> > get
>> > >> > > > > > confused quite easily. Therefore, I would like to emphasise
>> > >> that we
>> > >> > > > need
>> > >> > > > > a
>> > >> > > > > > very good documentation about how to properly configure
>> Flink
>> > >> > > processes
>> > >> > > > > and
>> > >> > > > > > which knobs to turn in which cases.
>> > >> > > > > >
>> > >> > > > > > Cheers,
>> > >> > > > > > Till
>> > >> > > > > >
>> > >> > > > > > On Tue, Sep 3, 2019 at 2:34 PM Andrey Zagrebin <
>> > >> > and...@ververica.com
>> > >> > > >
>> > >> > > > > > wrote:
>> > >> > > > > >
>> > >> > > > > > > Thanks for starting the vote Xintong
>> > >> > > > > > >
>> > >> > > > > > > Also +1 for the proposed FLIP-49.
>> > >> > > > > > >
>> > >> > > > > > > @Stephan regarding namings: network vs shuffle.
>> > >> > > > > > > My understanding so far was that the network memory is
>> what
>> > we
>> > >> > > > > basically
>> > >> > > > > > > give to Shuffle implementations and default netty
>> > >> implementation
>> > >> > > uses
>> > >> > > > > it
>> > >> > > > > > in
>> > >> > > > > > > particular mostly for networking.
>> > >> > > > > > > Are the network pools used for something else outside of
>> the
>> > >> > > > shuffling
>> > >> > > > > > > scope?
>> > >> > > > > > >
>> > >> > > > > > > best,
>> > >> > > > > > > Andrey
>> > >> > > > > > >
>> > >> > > > > > > On Tue, Sep 3, 2019 at 1:01 PM Stephan Ewen <
>> > se...@apache.org
>> > >> >
>> > >> > > > wrote:
>> > >> > > > > > >
>> > >> > > > > > > > +1 to the proposal in general
>> > >> > > > > > > >
>> > >> > > > > > > > A few things seems to be a bit put of sync with the
>> latest
>> > >> > > > > discussions
>> > >> > > > > > > > though.
>> > >> > > > > > > >
>> > >> > > > > > > > The section about JVM Parameters states that the
>> > >> > > > > > > > -XX:MaxDirectMemorySize value is set to Task Off-heap
>> > >> Memory,
>> > >> > > > Shuffle
>> > >> > > > > > > > Memory and JVM Overhead.
>> > >> > > > > > > > The way I understand the last discussion conclusion is
>> > that
>> > >> it
>> > >> > is
>> > >> > > > > only
>> > >> > > > > > > the
>> > >> > > > > > > > sum of shuffle memory and user-defined direct memory.
>> > >> > > > > > > >
>> > >> > > > > > > > I am someone neutral but unsure about is the separation
>> > >> between
>> > >> > > > > > > > "taskmanager.memory.framework.heap" and
>> > >> > > > > "taskmanager.memory.task.heap".
>> > >> > > > > > > > Could that be simply combined under
>> > >> > > "taskmanager.memory.javaheap"?
>> > >> > > > > > > >
>> > >> > > > > > > > It might be good to also expose these values somehow in
>> > the
>> > >> web
>> > >> > > UI
>> > >> > > > so
>> > >> > > > > > > that
>> > >> > > > > > > > users see immediately what amount of memory TMs assume
>> to
>> > >> use
>> > >> > for
>> > >> > > > > what.
>> > >> > > > > > > >
>> > >> > > > > > > > I assume config key names and default values might be
>> > >> adjusted
>> > >> > > over
>> > >> > > > > > time
>> > >> > > > > > > as
>> > >> > > > > > > > we get feedback.
>> > >> > > > > > > >   - I would keep the network memory under the name
>> > >> > > > > > > > "taskmanager.memory.network". Because network memory is
>> > >> > actually
>> > >> > > > used
>> > >> > > > > > for
>> > >> > > > > > > > more than shuffling. Also, the old config key seems
>> good,
>> > so
>> > >> > why
>> > >> > > > > change
>> > >> > > > > > > it?
>> > >> > > > > > > >
>> > >> > > > > > > > One thing to be aware of is that often, the Java Heap
>> is
>> > >> > > understood
>> > >> > > > > as
>> > >> > > > > > > > "managed memory" as a whole, because it is managed by
>> the
>> > GC
>> > >> > not
>> > >> > > > > > > explicitly
>> > >> > > > > > > > by the user.
>> > >> > > > > > > > So we need to make sure that we don't confuse users by
>> > >> speaking
>> > >> > > of
>> > >> > > > > > > managed
>> > >> > > > > > > > heap and unmanaged heap. All heap is managed in Java.
>> Some
>> > >> > memory
>> > >> > > > is
>> > >> > > > > > > > explicitly managed by Flink.
>> > >> > > > > > > >
>> > >> > > > > > > > Best,
>> > >> > > > > > > > Stephan
>> > >> > > > > > > >
>> > >> > > > > > > >
>> > >> > > > > > > > On Mon, Sep 2, 2019 at 3:08 PM Xintong Song <
>> > >> > > tonysong...@gmail.com
>> > >> > > > >
>> > >> > > > > > > wrote:
>> > >> > > > > > > >
>> > >> > > > > > > > > Hi everyone,
>> > >> > > > > > > > >
>> > >> > > > > > > > > I'm here to re-start the voting process for FLIP-49
>> [1],
>> > >> with
>> > >> > > > > respect
>> > >> > > > > > > to
>> > >> > > > > > > > > consensus reached in this thread [2] regarding some
>> new
>> > >> > > comments
>> > >> > > > > and
>> > >> > > > > > > > > concerns.
>> > >> > > > > > > > >
>> > >> > > > > > > > > This voting will be open for at least 72 hours. I'll
>> try
>> > >> to
>> > >> > > close
>> > >> > > > > it
>> > >> > > > > > > Sep.
>> > >> > > > > > > > > 5, 14:00 UTC, unless there is an objection or not
>> enough
>> > >> > votes.
>> > >> > > > > > > > >
>> > >> > > > > > > > > Thank you~
>> > >> > > > > > > > >
>> > >> > > > > > > > > Xintong Song
>> > >> > > > > > > > >
>> > >> > > > > > > > >
>> > >> > > > > > > > > [1]
>> > >> > > > > > > > >
>> > >> > > > > > > > >
>> > >> > > > > > > >
>> > >> > > > > > >
>> > >> > > > > >
>> > >> > > > >
>> > >> > > >
>> > >> > >
>> > >> >
>> > >>
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
>> > >> > > > > > > > > [2]
>> > >> > > > > > > > >
>> > >> > > > > > > > >
>> > >> > > > > > > >
>> > >> > > > > > >
>> > >> > > > > >
>> > >> > > > >
>> > >> > > >
>> > >> > >
>> > >> >
>> > >>
>> >
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-49-Unified-Memory-Configuration-for-TaskExecutors-td31436.html
>> > >> > > > > > > > >
>> > >> > > > > > > > > On Tue, Aug 27, 2019 at 9:29 PM Xintong Song <
>> > >> > > > > tonysong...@gmail.com>
>> > >> > > > > > > > > wrote:
>> > >> > > > > > > > >
>> > >> > > > > > > > > > Alright, then let's keep the discussion in the
>> DISCUSS
>> > >> > > mailing
>> > >> > > > > > > thread,
>> > >> > > > > > > > > and
>> > >> > > > > > > > > > see whether we need to restart the vote.
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > Thank you~
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > Xintong Song
>> > >> > > > > > > > > >
>> > >> > > > > > > > > >
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > On Tue, Aug 27, 2019 at 8:12 PM Till Rohrmann <
>> > >> > > > > > trohrm...@apache.org>
>> > >> > > > > > > > > > wrote:
>> > >> > > > > > > > > >
>> > >> > > > > > > > > >> I had a couple of comments concerning the
>> > >> implementation
>> > >> > > plan.
>> > >> > > > > > I've
>> > >> > > > > > > > > posted
>> > >> > > > > > > > > >> them to the original discussion thread. Depending
>> on
>> > >> the
>> > >> > > > outcome
>> > >> > > > > > of
>> > >> > > > > > > > this
>> > >> > > > > > > > > >> discussion we might need to restart the vote.
>> > >> > > > > > > > > >>
>> > >> > > > > > > > > >> Cheers,
>> > >> > > > > > > > > >> Till
>> > >> > > > > > > > > >>
>> > >> > > > > > > > > >> On Tue, Aug 27, 2019 at 11:14 AM Xintong Song <
>> > >> > > > > > > tonysong...@gmail.com>
>> > >> > > > > > > > > >> wrote:
>> > >> > > > > > > > > >>
>> > >> > > > > > > > > >> > Hi all,
>> > >> > > > > > > > > >> >
>> > >> > > > > > > > > >> > I would like to start the voting process for
>> > FLIP-49
>> > >> > [1],
>> > >> > > > > which
>> > >> > > > > > is
>> > >> > > > > > > > > >> > discussed and reached consensus in this thread
>> [2].
>> > >> > > > > > > > > >> >
>> > >> > > > > > > > > >> > This voting will be open for at least 72 hours.
>> > I'll
>> > >> try
>> > >> > > to
>> > >> > > > > > close
>> > >> > > > > > > it
>> > >> > > > > > > > > >> Aug.
>> > >> > > > > > > > > >> > 30 10:00 UTC, unless there is an objection or
>> not
>> > >> enough
>> > >> > > > > votes.
>> > >> > > > > > > > > >> >
>> > >> > > > > > > > > >> > Thank you~
>> > >> > > > > > > > > >> >
>> > >> > > > > > > > > >> > Xintong Song
>> > >> > > > > > > > > >> >
>> > >> > > > > > > > > >> >
>> > >> > > > > > > > > >> > [1]
>> > >> > > > > > > > > >> >
>> > >> > > > > > > > > >> >
>> > >> > > > > > > > > >>
>> > >> > > > > > > > >
>> > >> > > > > > > >
>> > >> > > > > > >
>> > >> > > > > >
>> > >> > > > >
>> > >> > > >
>> > >> > >
>> > >> >
>> > >>
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
>> > >> > > > > > > > > >> > [2]
>> > >> > > > > > > > > >> >
>> > >> > > > > > > > > >> >
>> > >> > > > > > > > > >>
>> > >> > > > > > > > >
>> > >> > > > > > > >
>> > >> > > > > > >
>> > >> > > > > >
>> > >> > > > >
>> > >> > > >
>> > >> > >
>> > >> >
>> > >>
>> >
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-49-Unified-Memory-Configuration-for-TaskExecutors-td31436.html
>> > >> > > > > > > > > >> >
>> > >> > > > > > > > > >>
>> > >> > > > > > > > > >
>> > >> > > > > > > > >
>> > >> > > > > > > >
>> > >> > > > > > >
>> > >> > > > > >
>> > >> > > > > >
>> > >> > > > >
>> > >> > > >
>> > >> > >
>> > >> >
>> > >>
>> > >
>> >
>>
>

Reply via email to