I just updated the FLIP wiki page [1], with the following changes:
- Network memory uses JVM direct memory and is accounted for when setting the JVM max direct memory size parameter.
- Use dynamic configurations (`-Dkey=value`) to pass calculated memory configs into TaskExecutors, instead of ENV variables.
- Remove 'supporting memory reservation' from the scope of this FLIP.
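For illustration, here is a minimal sketch of what "pass calculated memory configs via dynamic configurations" could look like when a resource manager assembles the TaskExecutor launch command. The option keys, sizes, and the exact command layout are illustrative assumptions, not the final FLIP-49 options.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: turn pre-calculated memory sizes into -Dkey=value dynamic configuration arguments. */
public class DynamicPropertiesSketch {

    public static void main(String[] args) {
        // Values computed by the startup utility / resource manager (keys are illustrative).
        Map<String, String> calculated = new LinkedHashMap<>();
        calculated.put("taskmanager.memory.task.heap.size", "512m");
        calculated.put("taskmanager.memory.network.size", "128m");
        calculated.put("taskmanager.memory.managed.size", "256m");

        List<String> command = new ArrayList<>();
        command.add("java");
        command.add("-Xmx640m");                      // derived from the same calculation
        command.add("-XX:MaxDirectMemorySize=256m");  // network + task off-heap + overhead
        command.add("org.apache.flink.runtime.taskexecutor.TaskManagerRunner");
        for (Map.Entry<String, String> e : calculated.entrySet()) {
            // Each entry becomes a -Dkey=value program argument overlaid on the loaded
            // flink-conf.yaml, so no per-TaskManager config file has to be written.
            command.add("-D" + e.getKey() + "=" + e.getValue());
        }
        System.out.println(String.join(" ", command));
    }
}
```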
@till @stephan, please take another look see if there are any other concerns. Thank you~ Xintong Song [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors On Mon, Sep 2, 2019 at 11:13 AM Xintong Song <tonysong...@gmail.com> wrote: > Sorry for the late response. > > - Regarding the `TaskExecutorSpecifics` naming, let's discuss the detail > in PR. > - Regarding passing parameters into the `TaskExecutor`, +1 for using > dynamic configuration at the moment, given that there are more questions to > be discussed to have a general framework for overwriting configurations > with ENV variables. > - Regarding memory reservation, I double checked with Yu and he will take > care of it. > > Thank you~ > > Xintong Song > > > > On Thu, Aug 29, 2019 at 7:35 PM Till Rohrmann <trohrm...@apache.org> > wrote: > >> What I forgot to add is that we could tackle specifying the configuration >> fully in an incremental way and that the full specification should be the >> desired end state. >> >> On Thu, Aug 29, 2019 at 1:33 PM Till Rohrmann <trohrm...@apache.org> >> wrote: >> >> > I think our goal should be that the configuration is fully specified >> when >> > the process is started. By considering the internal calculation step to >> be >> > rather validate existing values and calculate missing ones, these two >> > proposal shouldn't even conflict (given determinism). >> > >> > Since we don't want to change an existing flink-conf.yaml, specifying >> the >> > full configuration would require to pass in the options differently. >> > >> > One way could be the ENV variables approach. The reason why I'm trying >> to >> > exclude this feature from the FLIP is that I believe it needs a bit more >> > discussion. Just some questions which come to my mind: What would be the >> > exact format (FLINK_KEY_NAME)? Would we support a dot separator which is >> > supported by some systems (FLINK.KEY.NAME)? If we accept the dot >> > separator what would be the order of precedence if there are two ENV >> > variables defined (FLINK_KEY_NAME and FLINK.KEY.NAME)? What is the >> > precedence of env variable vs. dynamic configuration value specified >> via -D? >> > >> > Another approach could be to pass in the dynamic configuration values >> via >> > `-Dkey=value` to the Flink process. For that we don't have to change >> > anything because the functionality already exists. >> > >> > Cheers, >> > Till >> > >> > On Thu, Aug 29, 2019 at 12:50 PM Stephan Ewen <se...@apache.org> wrote: >> > >> >> I see. Under the assumption of strict determinism that should work. >> >> >> >> The original proposal had this point "don't compute inside the TM, >> compute >> >> outside and supply a full config", because that sounded more intuitive. >> >> >> >> On Thu, Aug 29, 2019 at 12:15 PM Till Rohrmann <trohrm...@apache.org> >> >> wrote: >> >> >> >> > My understanding was that before starting the Flink process we call a >> >> > utility which calculates these values. I assume that this utility >> will >> >> do >> >> > the calculation based on a set of configured values (process memory, >> >> flink >> >> > memory, network memory etc.). Assuming that these values don't differ >> >> from >> >> > the values with which the JVM is started, it should be possible to >> >> > recompute them in the Flink process in order to set the values. 
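To make the recomputation idea concrete: as long as the derivation is a pure function of the configured values, the startup utility and the TaskExecutor can run the same code and arrive at identical numbers. A minimal sketch, with made-up fractions and field names:

```java
/** Sketch: a deterministic memory derivation that can run both outside and inside the JVM. */
public final class MemoryBudget {

    final long heapBytes;
    final long networkBytes;
    final long managedBytes;

    private MemoryBudget(long heapBytes, long networkBytes, long managedBytes) {
        this.heapBytes = heapBytes;
        this.networkBytes = networkBytes;
        this.managedBytes = managedBytes;
    }

    /** Same inputs always yield the same outputs, so the startup script and the process agree. */
    static MemoryBudget fromTotalProcessMemory(long totalBytes) {
        long network = (long) (totalBytes * 0.1);   // illustrative fraction
        long managed = (long) (totalBytes * 0.4);   // illustrative fraction
        long heap = totalBytes - network - managed; // remainder goes to the heap in this sketch
        return new MemoryBudget(heap, network, managed);
    }

    public static void main(String[] args) {
        MemoryBudget outside = fromTotalProcessMemory(1L << 30); // computed by the startup utility
        MemoryBudget inside  = fromTotalProcessMemory(1L << 30); // recomputed in the TaskExecutor
        System.out.println(outside.heapBytes == inside.heapBytes); // true: determinism
    }
}
```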
>> >> > >> >> > >> >> > >> >> > On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen <se...@apache.org> >> wrote: >> >> > >> >> > > When computing the values in the JVM process after it started, how >> >> would >> >> > > you deal with values like Max Direct Memory, Metaspace size. native >> >> > memory >> >> > > reservation (reduce heap size), etc? All the values that are >> >> parameters >> >> > to >> >> > > the JVM process and that need to be supplied at process startup? >> >> > > >> >> > > On Wed, Aug 28, 2019 at 4:46 PM Till Rohrmann < >> trohrm...@apache.org> >> >> > > wrote: >> >> > > >> >> > > > Thanks for the clarification. I have some more comments: >> >> > > > >> >> > > > - I would actually split the logic to compute the process memory >> >> > > > requirements and storing the values into two things. E.g. one >> could >> >> > name >> >> > > > the former TaskExecutorProcessUtility and the latter >> >> > > > TaskExecutorProcessMemory. But we can discuss this on the PR >> since >> >> it's >> >> > > > just a naming detail. >> >> > > > >> >> > > > - Generally, I'm not opposed to making configuration values >> >> overridable >> >> > > by >> >> > > > ENV variables. I think this is a very good idea and makes the >> >> > > > configurability of Flink processes easier. However, I think that >> >> adding >> >> > > > this functionality should not be part of this FLIP because it >> would >> >> > > simply >> >> > > > widen the scope unnecessarily. >> >> > > > >> >> > > > The reasons why I believe it is unnecessary are the following: >> For >> >> Yarn >> >> > > we >> >> > > > already create write a flink-conf.yaml which could be populated >> with >> >> > the >> >> > > > memory settings. For the other processes it should not make a >> >> > difference >> >> > > > whether the loaded Configuration is populated with the memory >> >> settings >> >> > > from >> >> > > > ENV variables or by using TaskExecutorProcessUtility to compute >> the >> >> > > missing >> >> > > > values from the loaded configuration. If the latter would not be >> >> > possible >> >> > > > (wrong or missing configuration values), then we should not have >> >> been >> >> > > able >> >> > > > to actually start the process in the first place. >> >> > > > >> >> > > > - Concerning the memory reservation: I agree with you that we >> need >> >> the >> >> > > > memory reservation functionality to make streaming jobs work with >> >> > > "managed" >> >> > > > memory. However, w/o this functionality the whole Flip would >> already >> >> > > bring >> >> > > > a good amount of improvements to our users when running batch >> jobs. >> >> > > > Moreover, by keeping the scope smaller we can complete the FLIP >> >> faster. >> >> > > > Hence, I would propose to address the memory reservation >> >> functionality >> >> > > as a >> >> > > > follow up FLIP (which Yu is working on if I'm not mistaken). >> >> > > > >> >> > > > Cheers, >> >> > > > Till >> >> > > > >> >> > > > On Wed, Aug 28, 2019 at 11:43 AM Yang Wang < >> danrtsey...@gmail.com> >> >> > > wrote: >> >> > > > >> >> > > > > Just add my 2 cents. >> >> > > > > >> >> > > > > Using environment variables to override the configuration for >> >> > different >> >> > > > > taskmanagers is better. >> >> > > > > We do not need to generate dedicated flink-conf.yaml for all >> >> > > > taskmanagers. >> >> > > > > A common flink-conf.yam and different environment variables are >> >> > enough. 
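As a rough illustration of the environment-variable override being discussed here (Stephan spells out one possible convention below), the translation could look like the sketch that follows. The prefix and separator handling are exactly the open questions raised above, so treat this as one possible convention, not a decided format:

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: overlay config options taken from environment variables onto the loaded config. */
public class EnvOverrideSketch {

    static Map<String, String> overlayFromEnv(Map<String, String> config, Map<String, String> env) {
        Map<String, String> result = new HashMap<>(config);
        for (Map.Entry<String, String> e : env.entrySet()) {
            String key = e.getKey().toLowerCase();
            if (key.startsWith("flink_")) {
                // e.g. flink_taskmanager_memory_size=2g -> taskmanager.memory.size: 2g
                result.put(key.substring("flink_".length()).replace('_', '.'), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> env = Map.of("FLINK_TASKMANAGER_MEMORY_SIZE", "2g");
        System.out.println(overlayFromEnv(new HashMap<>(), env)); // {taskmanager.memory.size=2g}
    }
}
```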
>> >> > > > > By reducing the distributed cached files, it could make >> launching >> >> a >> >> > > > > taskmanager faster. >> >> > > > > >> >> > > > > Stephan gives a good suggestion that we could move the logic >> into >> >> > > > > "GlobalConfiguration.loadConfig()" method. >> >> > > > > Maybe the client could also benefit from this. Different users >> do >> >> not >> >> > > > have >> >> > > > > to export FLINK_CONF_DIR to update few config options. >> >> > > > > >> >> > > > > >> >> > > > > Best, >> >> > > > > Yang >> >> > > > > >> >> > > > > Stephan Ewen <se...@apache.org> 于2019年8月28日周三 上午1:21写道: >> >> > > > > >> >> > > > > > One note on the Environment Variables and Configuration >> >> discussion. >> >> > > > > > >> >> > > > > > My understanding is that passed ENV variables are added to >> the >> >> > > > > > configuration in the "GlobalConfiguration.loadConfig()" >> method >> >> (or >> >> > > > > > similar). >> >> > > > > > For all the code inside Flink, it looks like the data was in >> the >> >> > > config >> >> > > > > to >> >> > > > > > start with, just that the scripts that compute the variables >> can >> >> > pass >> >> > > > the >> >> > > > > > values to the process without actually needing to write a >> file. >> >> > > > > > >> >> > > > > > For example the "GlobalConfiguration.loadConfig()" method >> would >> >> > take >> >> > > > any >> >> > > > > > ENV variable prefixed with "flink" and add it as a config >> key. >> >> > > > > > "flink_taskmanager_memory_size=2g" would become >> >> > > > "taskmanager.memory.size: >> >> > > > > > 2g". >> >> > > > > > >> >> > > > > > >> >> > > > > > On Tue, Aug 27, 2019 at 4:05 PM Xintong Song < >> >> > tonysong...@gmail.com> >> >> > > > > > wrote: >> >> > > > > > >> >> > > > > > > Thanks for the comments, Till. >> >> > > > > > > >> >> > > > > > > I've also seen your comments on the wiki page, but let's >> keep >> >> the >> >> > > > > > > discussion here. >> >> > > > > > > >> >> > > > > > > - Regarding 'TaskExecutorSpecifics', how do you think about >> >> > naming >> >> > > it >> >> > > > > > > 'TaskExecutorResourceSpecifics'. >> >> > > > > > > - Regarding passing memory configurations into task >> executors, >> >> > I'm >> >> > > in >> >> > > > > > favor >> >> > > > > > > of do it via environment variables rather than >> configurations, >> >> > with >> >> > > > the >> >> > > > > > > following two reasons. >> >> > > > > > > - It is easier to keep the memory options once calculate >> >> not to >> >> > > be >> >> > > > > > > changed with environment variables rather than >> configurations. >> >> > > > > > > - I'm not sure whether we should write the configuration >> in >> >> > > startup >> >> > > > > > > scripts. Writing changes into the configuration files when >> >> > running >> >> > > > the >> >> > > > > > > startup scripts does not sounds right to me. Or we could >> make >> >> a >> >> > > copy >> >> > > > of >> >> > > > > > > configuration files per flink cluster, and make the task >> >> executor >> >> > > to >> >> > > > > load >> >> > > > > > > from the copy, and clean up the copy after the cluster is >> >> > shutdown, >> >> > > > > which >> >> > > > > > > is complicated. (I think this is also what Stephan means in >> >> his >> >> > > > comment >> >> > > > > > on >> >> > > > > > > the wiki page?) >> >> > > > > > > - Regarding reserving memory, I think this change should be >> >> > > included >> >> > > > in >> >> > > > > > > this FLIP. 
I think a big part of motivations of this FLIP >> is >> >> to >> >> > > unify >> >> > > > > > > memory configuration for streaming / batch and make it easy >> >> for >> >> > > > > > configuring >> >> > > > > > > rocksdb memory. If we don't support memory reservation, >> then >> >> > > > streaming >> >> > > > > > jobs >> >> > > > > > > cannot use managed memory (neither on-heap or off-heap), >> which >> >> > > makes >> >> > > > > this >> >> > > > > > > FLIP incomplete. >> >> > > > > > > - Regarding network memory, I think you are right. I think >> we >> >> > > > probably >> >> > > > > > > don't need to change network stack from using direct >> memory to >> >> > > using >> >> > > > > > unsafe >> >> > > > > > > native memory. Network memory size is deterministic, >> cannot be >> >> > > > reserved >> >> > > > > > as >> >> > > > > > > managed memory does, and cannot be overused. I think it >> also >> >> > works >> >> > > if >> >> > > > > we >> >> > > > > > > simply keep using direct memory for network and include it >> in >> >> jvm >> >> > > max >> >> > > > > > > direct memory size. >> >> > > > > > > >> >> > > > > > > Thank you~ >> >> > > > > > > >> >> > > > > > > Xintong Song >> >> > > > > > > >> >> > > > > > > >> >> > > > > > > >> >> > > > > > > On Tue, Aug 27, 2019 at 8:12 PM Till Rohrmann < >> >> > > trohrm...@apache.org> >> >> > > > > > > wrote: >> >> > > > > > > >> >> > > > > > > > Hi Xintong, >> >> > > > > > > > >> >> > > > > > > > thanks for addressing the comments and adding a more >> >> detailed >> >> > > > > > > > implementation plan. I have a couple of comments >> concerning >> >> the >> >> > > > > > > > implementation plan: >> >> > > > > > > > >> >> > > > > > > > - The name `TaskExecutorSpecifics` is not really >> >> descriptive. >> >> > > > > Choosing >> >> > > > > > a >> >> > > > > > > > different name could help here. >> >> > > > > > > > - I'm not sure whether I would pass the memory >> >> configuration to >> >> > > the >> >> > > > > > > > TaskExecutor via environment variables. I think it would >> be >> >> > > better >> >> > > > to >> >> > > > > > > write >> >> > > > > > > > it into the configuration one uses to start the TM >> process. >> >> > > > > > > > - If possible, I would exclude the memory reservation >> from >> >> this >> >> > > > FLIP >> >> > > > > > and >> >> > > > > > > > add this as part of a dedicated FLIP. >> >> > > > > > > > - If possible, then I would exclude changes to the >> network >> >> > stack >> >> > > > from >> >> > > > > > > this >> >> > > > > > > > FLIP. Maybe we can simply say that the direct memory >> needed >> >> by >> >> > > the >> >> > > > > > > network >> >> > > > > > > > stack is the framework direct memory requirement. >> Changing >> >> how >> >> > > the >> >> > > > > > memory >> >> > > > > > > > is allocated can happen in a second step. This would keep >> >> the >> >> > > scope >> >> > > > > of >> >> > > > > > > this >> >> > > > > > > > FLIP smaller. >> >> > > > > > > > >> >> > > > > > > > Cheers, >> >> > > > > > > > Till >> >> > > > > > > > >> >> > > > > > > > On Thu, Aug 22, 2019 at 2:51 PM Xintong Song < >> >> > > > tonysong...@gmail.com> >> >> > > > > > > > wrote: >> >> > > > > > > > >> >> > > > > > > > > Hi everyone, >> >> > > > > > > > > >> >> > > > > > > > > I just updated the FLIP document on wiki [1], with the >> >> > > following >> >> > > > > > > changes. >> >> > > > > > > > > >> >> > > > > > > > > - Removed open question regarding MemorySegment >> >> > allocation. 
>> >> > > As >> >> > > > > > > > > discussed, we exclude this topic from the scope of >> this >> >> > > FLIP. >> >> > > > > > > > > - Updated content about JVM direct memory parameter >> >> > > according >> >> > > > to >> >> > > > > > > > recent >> >> > > > > > > > > discussions, and moved the other options to >> "Rejected >> >> > > > > > Alternatives" >> >> > > > > > > > for >> >> > > > > > > > > the >> >> > > > > > > > > moment. >> >> > > > > > > > > - Added implementation steps. >> >> > > > > > > > > >> >> > > > > > > > > >> >> > > > > > > > > Thank you~ >> >> > > > > > > > > >> >> > > > > > > > > Xintong Song >> >> > > > > > > > > >> >> > > > > > > > > >> >> > > > > > > > > [1] >> >> > > > > > > > > >> >> > > > > > > > > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > >> >> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors >> >> > > > > > > > > >> >> > > > > > > > > On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen < >> >> > se...@apache.org >> >> > > > >> >> > > > > > wrote: >> >> > > > > > > > > >> >> > > > > > > > > > @Xintong: Concerning "wait for memory users before >> task >> >> > > dispose >> >> > > > > and >> >> > > > > > > > > memory >> >> > > > > > > > > > release": I agree, that's how it should be. Let's >> try it >> >> > out. >> >> > > > > > > > > > >> >> > > > > > > > > > @Xintong @Jingsong: Concerning " JVM does not wait >> for >> >> GC >> >> > > when >> >> > > > > > > > allocating >> >> > > > > > > > > > direct memory buffer": There seems to be pretty >> >> elaborate >> >> > > logic >> >> > > > > to >> >> > > > > > > free >> >> > > > > > > > > > buffers when allocating new ones. See >> >> > > > > > > > > > >> >> > > > > > > > > > >> >> > > > > > > > > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > >> >> >> http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643 >> >> > > > > > > > > > >> >> > > > > > > > > > @Till: Maybe. If we assume that the JVM default works >> >> (like >> >> > > > going >> >> > > > > > > with >> >> > > > > > > > > > option 2 and not setting "-XX:MaxDirectMemorySize" at >> >> all), >> >> > > > then >> >> > > > > I >> >> > > > > > > > think >> >> > > > > > > > > it >> >> > > > > > > > > > should be okay to set "-XX:MaxDirectMemorySize" to >> >> > > > > > > > > > "off_heap_managed_memory + direct_memory" even if we >> use >> >> > > > RocksDB. >> >> > > > > > > That >> >> > > > > > > > > is a >> >> > > > > > > > > > big if, though, I honestly have no idea :D Would be >> >> good to >> >> > > > > > > understand >> >> > > > > > > > > > this, though, because this would affect option (2) >> and >> >> > option >> >> > > > > > (1.2). >> >> > > > > > > > > > >> >> > > > > > > > > > On Mon, Aug 19, 2019 at 4:44 PM Xintong Song < >> >> > > > > > tonysong...@gmail.com> >> >> > > > > > > > > > wrote: >> >> > > > > > > > > > >> >> > > > > > > > > > > Thanks for the inputs, Jingsong. >> >> > > > > > > > > > > >> >> > > > > > > > > > > Let me try to summarize your points. Please correct >> >> me if >> >> > > I'm >> >> > > > > > > wrong. 
>> >> > > > > > > > > > > >> >> > > > > > > > > > > - Memory consumers should always avoid returning >> >> > memory >> >> > > > > > segments >> >> > > > > > > > to >> >> > > > > > > > > > > memory manager while there are still un-cleaned >> >> > > > structures / >> >> > > > > > > > threads >> >> > > > > > > > > > > that >> >> > > > > > > > > > > may use the memory. Otherwise, it would cause >> >> serious >> >> > > > > problems >> >> > > > > > > by >> >> > > > > > > > > > having >> >> > > > > > > > > > > multiple consumers trying to use the same memory >> >> > > segment. >> >> > > > > > > > > > > - JVM does not wait for GC when allocating >> direct >> >> > memory >> >> > > > > > buffer. >> >> > > > > > > > > > > Therefore even we set proper max direct memory >> size >> >> > > limit, >> >> > > > > we >> >> > > > > > > may >> >> > > > > > > > > > still >> >> > > > > > > > > > > encounter direct memory oom if the GC cleaning >> >> memory >> >> > > > slower >> >> > > > > > > than >> >> > > > > > > > > the >> >> > > > > > > > > > > direct memory allocation. >> >> > > > > > > > > > > >> >> > > > > > > > > > > Am I understanding this correctly? >> >> > > > > > > > > > > >> >> > > > > > > > > > > Thank you~ >> >> > > > > > > > > > > >> >> > > > > > > > > > > Xintong Song >> >> > > > > > > > > > > >> >> > > > > > > > > > > >> >> > > > > > > > > > > >> >> > > > > > > > > > > On Mon, Aug 19, 2019 at 4:21 PM JingsongLee < >> >> > > > > > > lzljs3620...@aliyun.com >> >> > > > > > > > > > > .invalid> >> >> > > > > > > > > > > wrote: >> >> > > > > > > > > > > >> >> > > > > > > > > > > > Hi stephan: >> >> > > > > > > > > > > > >> >> > > > > > > > > > > > About option 2: >> >> > > > > > > > > > > > >> >> > > > > > > > > > > > if additional threads not cleanly shut down >> before >> >> we >> >> > can >> >> > > > > exit >> >> > > > > > > the >> >> > > > > > > > > > task: >> >> > > > > > > > > > > > In the current case of memory reuse, it has >> freed up >> >> > the >> >> > > > > memory >> >> > > > > > > it >> >> > > > > > > > > > > > uses. If this memory is used by other tasks and >> >> > > > asynchronous >> >> > > > > > > > threads >> >> > > > > > > > > > > > of exited task may still be writing, there will >> be >> >> > > > > concurrent >> >> > > > > > > > > security >> >> > > > > > > > > > > > problems, and even lead to errors in user >> computing >> >> > > > results. >> >> > > > > > > > > > > > >> >> > > > > > > > > > > > So I think this is a serious and intolerable >> bug, No >> >> > > matter >> >> > > > > > what >> >> > > > > > > > the >> >> > > > > > > > > > > > option is, it should be avoided. >> >> > > > > > > > > > > > >> >> > > > > > > > > > > > About direct memory cleaned by GC: >> >> > > > > > > > > > > > I don't think it is a good idea, I've >> encountered so >> >> > many >> >> > > > > > > > situations >> >> > > > > > > > > > > > that it's too late for GC to cause DirectMemory >> >> OOM. >> >> > > > Release >> >> > > > > > and >> >> > > > > > > > > > > > allocate DirectMemory depend on the type of user >> >> job, >> >> > > > which >> >> > > > > is >> >> > > > > > > > > > > > often beyond our control. 
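A small sketch of the failure mode described here: a direct buffer is charged against `-XX:MaxDirectMemorySize` at allocation time, but its native memory only comes back once the owning ByteBuffer object is garbage collected, so buffers still referenced (for example by an asynchronous spilling thread that has not shut down) keep the budget occupied. Run with a tight limit such as `-XX:MaxDirectMemorySize=64m` to provoke it; the sizes are illustrative.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

/** Sketch: why freeing direct memory depends on GC of the owning ByteBuffer objects. */
public class DirectMemoryPressure {

    public static void main(String[] args) {
        // Buffers handed to a consumer that has not released them yet; while referenced,
        // GC cannot reclaim them and the direct-memory budget stays occupied.
        Deque<ByteBuffer> stillInUse = new ArrayDeque<>();
        try {
            while (true) {
                // The JDK internally tries System.gc() and retries before failing, but that
                // only helps for buffers that are already unreachable.
                stillInUse.add(ByteBuffer.allocateDirect(8 * 1024 * 1024));
            }
        } catch (OutOfMemoryError e) {
            System.err.println("Direct buffer memory exhausted after "
                    + stillInUse.size() + " buffers: " + e.getMessage());
        }
    }
}
```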
>> >> > > > > > > > > > > > >> >> > > > > > > > > > > > Best, >> >> > > > > > > > > > > > Jingsong Lee >> >> > > > > > > > > > > > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > > >> >> > > > > > > >> >> > ------------------------------------------------------------------ >> >> > > > > > > > > > > > From:Stephan Ewen <se...@apache.org> >> >> > > > > > > > > > > > Send Time:2019年8月19日(星期一) 15:56 >> >> > > > > > > > > > > > To:dev <dev@flink.apache.org> >> >> > > > > > > > > > > > Subject:Re: [DISCUSS] FLIP-49: Unified Memory >> >> > > Configuration >> >> > > > > for >> >> > > > > > > > > > > > TaskExecutors >> >> > > > > > > > > > > > >> >> > > > > > > > > > > > My main concern with option 2 (manually release >> >> memory) >> >> > > is >> >> > > > > that >> >> > > > > > > > > > segfaults >> >> > > > > > > > > > > > in the JVM send off all sorts of alarms on user >> >> ends. >> >> > So >> >> > > we >> >> > > > > > need >> >> > > > > > > to >> >> > > > > > > > > > > > guarantee that this never happens. >> >> > > > > > > > > > > > >> >> > > > > > > > > > > > The trickyness is in tasks that uses data >> >> structures / >> >> > > > > > algorithms >> >> > > > > > > > > with >> >> > > > > > > > > > > > additional threads, like hash table spill/read >> and >> >> > > sorting >> >> > > > > > > threads. >> >> > > > > > > > > We >> >> > > > > > > > > > > need >> >> > > > > > > > > > > > to ensure that these cleanly shut down before we >> can >> >> > exit >> >> > > > the >> >> > > > > > > task. >> >> > > > > > > > > > > > I am not sure that we have that guaranteed >> already, >> >> > > that's >> >> > > > > why >> >> > > > > > > > option >> >> > > > > > > > > > 1.1 >> >> > > > > > > > > > > > seemed simpler to me. >> >> > > > > > > > > > > > >> >> > > > > > > > > > > > On Mon, Aug 19, 2019 at 3:42 PM Xintong Song < >> >> > > > > > > > tonysong...@gmail.com> >> >> > > > > > > > > > > > wrote: >> >> > > > > > > > > > > > >> >> > > > > > > > > > > > > Thanks for the comments, Stephan. Summarized in >> >> this >> >> > > way >> >> > > > > > really >> >> > > > > > > > > makes >> >> > > > > > > > > > > > > things easier to understand. >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > I'm in favor of option 2, at least for the >> >> moment. I >> >> > > > think >> >> > > > > it >> >> > > > > > > is >> >> > > > > > > > > not >> >> > > > > > > > > > > that >> >> > > > > > > > > > > > > difficult to keep it segfault safe for memory >> >> > manager, >> >> > > as >> >> > > > > > long >> >> > > > > > > as >> >> > > > > > > > > we >> >> > > > > > > > > > > > always >> >> > > > > > > > > > > > > de-allocate the memory segment when it is >> released >> >> > from >> >> > > > the >> >> > > > > > > > memory >> >> > > > > > > > > > > > > consumers. Only if the memory consumer continue >> >> using >> >> > > the >> >> > > > > > > buffer >> >> > > > > > > > of >> >> > > > > > > > > > > > memory >> >> > > > > > > > > > > > > segment after releasing it, in which case we do >> >> want >> >> > > the >> >> > > > > job >> >> > > > > > to >> >> > > > > > > > > fail >> >> > > > > > > > > > so >> >> > > > > > > > > > > > we >> >> > > > > > > > > > > > > detect the memory leak early. >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > For option 1.2, I don't think this is a good >> idea. 
>> >> > Not >> >> > > > only >> >> > > > > > > > because >> >> > > > > > > > > > the >> >> > > > > > > > > > > > > assumption (regular GC is enough to clean >> direct >> >> > > buffers) >> >> > > > > may >> >> > > > > > > not >> >> > > > > > > > > > > always >> >> > > > > > > > > > > > be >> >> > > > > > > > > > > > > true, but also it makes harder for finding >> >> problems >> >> > in >> >> > > > > cases >> >> > > > > > of >> >> > > > > > > > > > memory >> >> > > > > > > > > > > > > overuse. E.g., user configured some direct >> memory >> >> for >> >> > > the >> >> > > > > > user >> >> > > > > > > > > > > libraries. >> >> > > > > > > > > > > > > If the library actually use more direct memory >> >> then >> >> > > > > > configured, >> >> > > > > > > > > which >> >> > > > > > > > > > > > > cannot be cleaned by GC because they are still >> in >> >> > use, >> >> > > > may >> >> > > > > > lead >> >> > > > > > > > to >> >> > > > > > > > > > > > overuse >> >> > > > > > > > > > > > > of the total container memory. In that case, >> if it >> >> > > didn't >> >> > > > > > touch >> >> > > > > > > > the >> >> > > > > > > > > > JVM >> >> > > > > > > > > > > > > default max direct memory limit, we cannot get >> a >> >> > direct >> >> > > > > > memory >> >> > > > > > > > OOM >> >> > > > > > > > > > and >> >> > > > > > > > > > > it >> >> > > > > > > > > > > > > will become super hard to understand which >> part of >> >> > the >> >> > > > > > > > > configuration >> >> > > > > > > > > > > need >> >> > > > > > > > > > > > > to be updated. >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > For option 1.1, it has the similar problem as >> >> 1.2, if >> >> > > the >> >> > > > > > > > exceeded >> >> > > > > > > > > > > direct >> >> > > > > > > > > > > > > memory does not reach the max direct memory >> limit >> >> > > > specified >> >> > > > > > by >> >> > > > > > > > the >> >> > > > > > > > > > > > > dedicated parameter. I think it is slightly >> better >> >> > than >> >> > > > > 1.2, >> >> > > > > > > only >> >> > > > > > > > > > > because >> >> > > > > > > > > > > > > we can tune the parameter. >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > Thank you~ >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > Xintong Song >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen < >> >> > > > > > se...@apache.org >> >> > > > > > > > >> >> > > > > > > > > > wrote: >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > > About the "-XX:MaxDirectMemorySize" >> discussion, >> >> > maybe >> >> > > > let >> >> > > > > > me >> >> > > > > > > > > > > summarize >> >> > > > > > > > > > > > > it a >> >> > > > > > > > > > > > > > bit differently: >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > We have the following two options: >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > (1) We let MemorySegments be de-allocated by >> the >> >> > GC. >> >> > > > That >> >> > > > > > > makes >> >> > > > > > > > > it >> >> > > > > > > > > > > > > segfault >> >> > > > > > > > > > > > > > safe. But then we need a way to trigger GC in >> >> case >> >> > > > > > > > de-allocation >> >> > > > > > > > > > and >> >> > > > > > > > > > > > > > re-allocation of a bunch of segments happens >> >> > quickly, >> >> > > > > which >> >> > > > > > > is >> >> > > > > > > > > > often >> >> > > > > > > > > > > > the >> >> > > > > > > > > > > > > > case during batch scheduling or task restart. 
>> >> > > > > > > > > > > > > > - The "-XX:MaxDirectMemorySize" (option >> 1.1) >> >> is >> >> > one >> >> > > > way >> >> > > > > > to >> >> > > > > > > do >> >> > > > > > > > > > this >> >> > > > > > > > > > > > > > - Another way could be to have a dedicated >> >> > > > bookkeeping >> >> > > > > in >> >> > > > > > > the >> >> > > > > > > > > > > > > > MemoryManager (option 1.2), so that this is a >> >> > number >> >> > > > > > > > independent >> >> > > > > > > > > of >> >> > > > > > > > > > > the >> >> > > > > > > > > > > > > > "-XX:MaxDirectMemorySize" parameter. >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > (2) We manually allocate and de-allocate the >> >> memory >> >> > > for >> >> > > > > the >> >> > > > > > > > > > > > > MemorySegments >> >> > > > > > > > > > > > > > (option 2). That way we need not worry about >> >> > > triggering >> >> > > > > GC >> >> > > > > > by >> >> > > > > > > > > some >> >> > > > > > > > > > > > > > threshold or bookkeeping, but it is harder to >> >> > prevent >> >> > > > > > > > segfaults. >> >> > > > > > > > > We >> >> > > > > > > > > > > > need >> >> > > > > > > > > > > > > to >> >> > > > > > > > > > > > > > be very careful about when we release the >> memory >> >> > > > segments >> >> > > > > > > (only >> >> > > > > > > > > in >> >> > > > > > > > > > > the >> >> > > > > > > > > > > > > > cleanup phase of the main thread). >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > If we go with option 1.1, we probably need to >> >> set >> >> > > > > > > > > > > > > > "-XX:MaxDirectMemorySize" to >> >> > > "off_heap_managed_memory + >> >> > > > > > > > > > > direct_memory" >> >> > > > > > > > > > > > > and >> >> > > > > > > > > > > > > > have "direct_memory" as a separate reserved >> >> memory >> >> > > > pool. >> >> > > > > > > > Because >> >> > > > > > > > > if >> >> > > > > > > > > > > we >> >> > > > > > > > > > > > > just >> >> > > > > > > > > > > > > > set "-XX:MaxDirectMemorySize" to >> >> > > > > "off_heap_managed_memory + >> >> > > > > > > > > > > > > jvm_overhead", >> >> > > > > > > > > > > > > > then there will be times when that entire >> >> memory is >> >> > > > > > allocated >> >> > > > > > > > by >> >> > > > > > > > > > > direct >> >> > > > > > > > > > > > > > buffers and we have nothing left for the JVM >> >> > > overhead. >> >> > > > So >> >> > > > > > we >> >> > > > > > > > > either >> >> > > > > > > > > > > > need >> >> > > > > > > > > > > > > a >> >> > > > > > > > > > > > > > way to compensate for that (again some safety >> >> > margin >> >> > > > > cutoff >> >> > > > > > > > > value) >> >> > > > > > > > > > or >> >> > > > > > > > > > > > we >> >> > > > > > > > > > > > > > will exceed container memory. >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > If we go with option 1.2, we need to be aware >> >> that >> >> > it >> >> > > > > takes >> >> > > > > > > > > > elaborate >> >> > > > > > > > > > > > > logic >> >> > > > > > > > > > > > > > to push recycling of direct buffers without >> >> always >> >> > > > > > > triggering a >> >> > > > > > > > > > full >> >> > > > > > > > > > > > GC. 
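To make option 2 concrete, a bare-bones sketch of manual allocation and release follows: memory obtained this way is invisible both to the GC and to the `-XX:MaxDirectMemorySize` accounting, which is why bookkeeping and release discipline move entirely into Flink. The use-after-free case flagged in the comments is exactly the segfault risk mentioned above; the class and method names are illustrative, not Flink's MemorySegment API.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

/** Sketch (option 2): manually allocated memory that is released explicitly, not by GC. */
public class ManualSegment {

    private static final Unsafe UNSAFE = loadUnsafe();

    private long address;
    private final long size;

    ManualSegment(long size) {
        this.size = size;
        // Native allocation: not tracked by the GC or by -XX:MaxDirectMemorySize.
        this.address = UNSAFE.allocateMemory(size);
    }

    void putLong(long offset, long value) {
        // After free() this would write to unmapped or reused memory -> possible segfault.
        UNSAFE.putLong(address + offset, value);
    }

    void free() {
        UNSAFE.freeMemory(address); // must only happen once no thread can still use the segment
        address = 0;
    }

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        ManualSegment segment = new ManualSegment(4096);
        segment.putLong(0, 42L);
        segment.free(); // any access after this point is a use-after-free
    }
}
```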
>> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > My first guess is that the options will be >> >> easiest >> >> > to >> >> > > > do >> >> > > > > in >> >> > > > > > > the >> >> > > > > > > > > > > > following >> >> > > > > > > > > > > > > > order: >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > - Option 1.1 with a dedicated direct_memory >> >> > > > parameter, >> >> > > > > as >> >> > > > > > > > > > discussed >> >> > > > > > > > > > > > > > above. We would need to find a way to set the >> >> > > > > direct_memory >> >> > > > > > > > > > parameter >> >> > > > > > > > > > > > by >> >> > > > > > > > > > > > > > default. We could start with 64 MB and see >> how >> >> it >> >> > > goes >> >> > > > in >> >> > > > > > > > > practice. >> >> > > > > > > > > > > One >> >> > > > > > > > > > > > > > danger I see is that setting this loo low can >> >> > cause a >> >> > > > > bunch >> >> > > > > > > of >> >> > > > > > > > > > > > additional >> >> > > > > > > > > > > > > > GCs compared to before (we need to watch this >> >> > > > carefully). >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > - Option 2. It is actually quite simple to >> >> > > implement, >> >> > > > > we >> >> > > > > > > > could >> >> > > > > > > > > > try >> >> > > > > > > > > > > > how >> >> > > > > > > > > > > > > > segfault safe we are at the moment. >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > - Option 1.2: We would not touch the >> >> > > > > > > > "-XX:MaxDirectMemorySize" >> >> > > > > > > > > > > > > parameter >> >> > > > > > > > > > > > > > at all and assume that all the direct memory >> >> > > > allocations >> >> > > > > > that >> >> > > > > > > > the >> >> > > > > > > > > > JVM >> >> > > > > > > > > > > > and >> >> > > > > > > > > > > > > > Netty do are infrequent enough to be cleaned >> up >> >> > fast >> >> > > > > enough >> >> > > > > > > > > through >> >> > > > > > > > > > > > > regular >> >> > > > > > > > > > > > > > GC. I am not sure if that is a valid >> assumption, >> >> > > > though. >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > Best, >> >> > > > > > > > > > > > > > Stephan >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 2:16 PM Xintong Song >> < >> >> > > > > > > > > > tonysong...@gmail.com> >> >> > > > > > > > > > > > > > wrote: >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Thanks for sharing your opinion Till. >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > I'm also in favor of alternative 2. I was >> >> > wondering >> >> > > > > > whether >> >> > > > > > > > we >> >> > > > > > > > > > can >> >> > > > > > > > > > > > > avoid >> >> > > > > > > > > > > > > > > using Unsafe.allocate() for off-heap >> managed >> >> > memory >> >> > > > and >> >> > > > > > > > network >> >> > > > > > > > > > > > memory >> >> > > > > > > > > > > > > > with >> >> > > > > > > > > > > > > > > alternative 3. But after giving it a second >> >> > > thought, >> >> > > > I >> >> > > > > > > think >> >> > > > > > > > > even >> >> > > > > > > > > > > for >> >> > > > > > > > > > > > > > > alternative 3 using direct memory for >> off-heap >> >> > > > managed >> >> > > > > > > memory >> >> > > > > > > > > > could >> >> > > > > > > > > > > > > cause >> >> > > > > > > > > > > > > > > problems. 
>> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Hi Yang, >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Regarding your concern, I think what >> proposed >> >> in >> >> > > this >> >> > > > > > FLIP >> >> > > > > > > it >> >> > > > > > > > > to >> >> > > > > > > > > > > have >> >> > > > > > > > > > > > > > both >> >> > > > > > > > > > > > > > > off-heap managed memory and network memory >> >> > > allocated >> >> > > > > > > through >> >> > > > > > > > > > > > > > > Unsafe.allocate(), which means they are >> >> > practically >> >> > > > > > native >> >> > > > > > > > > memory >> >> > > > > > > > > > > and >> >> > > > > > > > > > > > > not >> >> > > > > > > > > > > > > > > limited by JVM max direct memory. The only >> >> parts >> >> > of >> >> > > > > > memory >> >> > > > > > > > > > limited >> >> > > > > > > > > > > by >> >> > > > > > > > > > > > > JVM >> >> > > > > > > > > > > > > > > max direct memory are task off-heap memory >> and >> >> > JVM >> >> > > > > > > overhead, >> >> > > > > > > > > > which >> >> > > > > > > > > > > > are >> >> > > > > > > > > > > > > > > exactly alternative 2 suggests to set the >> JVM >> >> max >> >> > > > > direct >> >> > > > > > > > memory >> >> > > > > > > > > > to. >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Thank you~ >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Xintong Song >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 1:48 PM Till >> Rohrmann >> >> < >> >> > > > > > > > > > > trohrm...@apache.org> >> >> > > > > > > > > > > > > > > wrote: >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > Thanks for the clarification Xintong. I >> >> > > understand >> >> > > > > the >> >> > > > > > > two >> >> > > > > > > > > > > > > alternatives >> >> > > > > > > > > > > > > > > > now. >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > I would be in favour of option 2 because >> it >> >> > makes >> >> > > > > > things >> >> > > > > > > > > > > explicit. >> >> > > > > > > > > > > > If >> >> > > > > > > > > > > > > > we >> >> > > > > > > > > > > > > > > > don't limit the direct memory, I fear >> that >> >> we >> >> > > might >> >> > > > > end >> >> > > > > > > up >> >> > > > > > > > > in a >> >> > > > > > > > > > > > > similar >> >> > > > > > > > > > > > > > > > situation as we are currently in: The >> user >> >> > might >> >> > > > see >> >> > > > > > that >> >> > > > > > > > her >> >> > > > > > > > > > > > process >> >> > > > > > > > > > > > > > > gets >> >> > > > > > > > > > > > > > > > killed by the OS and does not know why >> this >> >> is >> >> > > the >> >> > > > > > case. >> >> > > > > > > > > > > > > Consequently, >> >> > > > > > > > > > > > > > > she >> >> > > > > > > > > > > > > > > > tries to decrease the process memory size >> >> > > (similar >> >> > > > to >> >> > > > > > > > > > increasing >> >> > > > > > > > > > > > the >> >> > > > > > > > > > > > > > > cutoff >> >> > > > > > > > > > > > > > > > ratio) in order to accommodate for the >> extra >> >> > > direct >> >> > > > > > > memory. 
>> >> > > > > > > > > > Even >> >> > > > > > > > > > > > > worse, >> >> > > > > > > > > > > > > > > she >> >> > > > > > > > > > > > > > > > tries to decrease memory budgets which >> are >> >> not >> >> > > > fully >> >> > > > > > used >> >> > > > > > > > and >> >> > > > > > > > > > > hence >> >> > > > > > > > > > > > > > won't >> >> > > > > > > > > > > > > > > > change the overall memory consumption. >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > Cheers, >> >> > > > > > > > > > > > > > > > Till >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 11:01 AM Xintong >> >> Song < >> >> > > > > > > > > > > > tonysong...@gmail.com >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > wrote: >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Let me explain this with a concrete >> >> example >> >> > > Till. >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Let's say we have the following >> scenario. >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Total Process Memory: 1GB >> >> > > > > > > > > > > > > > > > > JVM Direct Memory (Task Off-Heap >> Memory + >> >> JVM >> >> > > > > > > Overhead): >> >> > > > > > > > > > 200MB >> >> > > > > > > > > > > > > > > > > Other Memory (JVM Heap Memory, JVM >> >> Metaspace, >> >> > > > > > Off-Heap >> >> > > > > > > > > > Managed >> >> > > > > > > > > > > > > Memory >> >> > > > > > > > > > > > > > > and >> >> > > > > > > > > > > > > > > > > Network Memory): 800MB >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > For alternative 2, we set >> >> > > -XX:MaxDirectMemorySize >> >> > > > > to >> >> > > > > > > > 200MB. >> >> > > > > > > > > > > > > > > > > For alternative 3, we set >> >> > > -XX:MaxDirectMemorySize >> >> > > > > to >> >> > > > > > a >> >> > > > > > > > very >> >> > > > > > > > > > > large >> >> > > > > > > > > > > > > > > value, >> >> > > > > > > > > > > > > > > > > let's say 1TB. >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > If the actual direct memory usage of >> Task >> >> > > > Off-Heap >> >> > > > > > > Memory >> >> > > > > > > > > and >> >> > > > > > > > > > > JVM >> >> > > > > > > > > > > > > > > > Overhead >> >> > > > > > > > > > > > > > > > > do not exceed 200MB, then alternative 2 >> >> and >> >> > > > > > > alternative 3 >> >> > > > > > > > > > > should >> >> > > > > > > > > > > > > have >> >> > > > > > > > > > > > > > > the >> >> > > > > > > > > > > > > > > > > same utility. Setting larger >> >> > > > > -XX:MaxDirectMemorySize >> >> > > > > > > will >> >> > > > > > > > > not >> >> > > > > > > > > > > > > reduce >> >> > > > > > > > > > > > > > > the >> >> > > > > > > > > > > > > > > > > sizes of the other memory pools. >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > If the actual direct memory usage of >> Task >> >> > > > Off-Heap >> >> > > > > > > Memory >> >> > > > > > > > > and >> >> > > > > > > > > > > JVM >> >> > > > > > > > > > > > > > > > > Overhead potentially exceed 200MB, then >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > - Alternative 2 suffers from >> frequent >> >> OOM. 
>> >> > > To >> >> > > > > > avoid >> >> > > > > > > > > that, >> >> > > > > > > > > > > the >> >> > > > > > > > > > > > > only >> >> > > > > > > > > > > > > > > > thing >> >> > > > > > > > > > > > > > > > > user can do is to modify the >> >> configuration >> >> > > and >> >> > > > > > > > increase >> >> > > > > > > > > > JVM >> >> > > > > > > > > > > > > Direct >> >> > > > > > > > > > > > > > > > > Memory >> >> > > > > > > > > > > > > > > > > (Task Off-Heap Memory + JVM >> Overhead). >> >> > Let's >> >> > > > say >> >> > > > > > > that >> >> > > > > > > > > user >> >> > > > > > > > > > > > > > increases >> >> > > > > > > > > > > > > > > > JVM >> >> > > > > > > > > > > > > > > > > Direct Memory to 250MB, this will >> >> reduce >> >> > the >> >> > > > > total >> >> > > > > > > > size >> >> > > > > > > > > of >> >> > > > > > > > > > > > other >> >> > > > > > > > > > > > > > > > memory >> >> > > > > > > > > > > > > > > > > pools to 750MB, given the total >> process >> >> > > memory >> >> > > > > > > remains >> >> > > > > > > > > > 1GB. >> >> > > > > > > > > > > > > > > > > - For alternative 3, there is no >> >> chance of >> >> > > > > direct >> >> > > > > > > OOM. >> >> > > > > > > > > > There >> >> > > > > > > > > > > > are >> >> > > > > > > > > > > > > > > > chances >> >> > > > > > > > > > > > > > > > > of exceeding the total process >> memory >> >> > limit, >> >> > > > but >> >> > > > > > > given >> >> > > > > > > > > > that >> >> > > > > > > > > > > > the >> >> > > > > > > > > > > > > > > > process >> >> > > > > > > > > > > > > > > > > may >> >> > > > > > > > > > > > > > > > > not use up all the reserved native >> >> memory >> >> > > > > > (Off-Heap >> >> > > > > > > > > > Managed >> >> > > > > > > > > > > > > > Memory, >> >> > > > > > > > > > > > > > > > > Network >> >> > > > > > > > > > > > > > > > > Memory, JVM Metaspace), if the >> actual >> >> > direct >> >> > > > > > memory >> >> > > > > > > > > usage >> >> > > > > > > > > > is >> >> > > > > > > > > > > > > > > slightly >> >> > > > > > > > > > > > > > > > > above >> >> > > > > > > > > > > > > > > > > yet very close to 200MB, user >> probably >> >> do >> >> > > not >> >> > > > > need >> >> > > > > > > to >> >> > > > > > > > > > change >> >> > > > > > > > > > > > the >> >> > > > > > > > > > > > > > > > > configurations. >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Therefore, I think from the user's >> >> > > perspective, a >> >> > > > > > > > feasible >> >> > > > > > > > > > > > > > > configuration >> >> > > > > > > > > > > > > > > > > for alternative 2 may lead to lower >> >> resource >> >> > > > > > > utilization >> >> > > > > > > > > > > compared >> >> > > > > > > > > > > > > to >> >> > > > > > > > > > > > > > > > > alternative 3. 
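Expressed as numbers, the 1 GB example above differs between the alternatives only in the `-XX:MaxDirectMemorySize` setting, and therefore in whether an overuse surfaces as a direct OOM inside the JVM or as the container exceeding its budget. A tiny sketch of that arithmetic, with illustrative values:

```java
/** Sketch: the 1 GB example, expressed as the two alternative -XX:MaxDirectMemorySize settings. */
public class DirectLimitExample {

    public static void main(String[] args) {
        long totalProcessMb = 1000;
        long directBudgetMb = 200;                            // task off-heap + JVM overhead
        long otherPoolsMb = totalProcessMb - directBudgetMb;  // heap, metaspace, managed, network

        // Alternative 2: hard limit equals the budget; overuse shows up as a direct OOM.
        String alt2 = "-XX:MaxDirectMemorySize=" + directBudgetMb + "m";

        // Alternative 3: effectively unlimited; overuse shows up (if at all) as the container
        // exceeding the total process memory and being killed by YARN / Kubernetes.
        String alt3 = "-XX:MaxDirectMemorySize=1024g";

        System.out.printf("other pools: %dm, alternative 2: %s, alternative 3: %s%n",
                otherPoolsMb, alt2, alt3);
    }
}
```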
>> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Thank you~ >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Xintong Song >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 10:28 AM Till >> >> > Rohrmann >> >> > > < >> >> > > > > > > > > > > > > trohrm...@apache.org >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > wrote: >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > I guess you have to help me >> understand >> >> the >> >> > > > > > difference >> >> > > > > > > > > > between >> >> > > > > > > > > > > > > > > > > alternative 2 >> >> > > > > > > > > > > > > > > > > > and 3 wrt to memory under utilization >> >> > > Xintong. >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > - Alternative 2: set >> >> XX:MaxDirectMemorySize >> >> > > to >> >> > > > > Task >> >> > > > > > > > > > Off-Heap >> >> > > > > > > > > > > > > Memory >> >> > > > > > > > > > > > > > > and >> >> > > > > > > > > > > > > > > > > JVM >> >> > > > > > > > > > > > > > > > > > Overhead. Then there is the risk that >> >> this >> >> > > size >> >> > > > > is >> >> > > > > > > too >> >> > > > > > > > > low >> >> > > > > > > > > > > > > > resulting >> >> > > > > > > > > > > > > > > > in a >> >> > > > > > > > > > > > > > > > > > lot of garbage collection and >> >> potentially >> >> > an >> >> > > > OOM. >> >> > > > > > > > > > > > > > > > > > - Alternative 3: set >> >> XX:MaxDirectMemorySize >> >> > > to >> >> > > > > > > > something >> >> > > > > > > > > > > larger >> >> > > > > > > > > > > > > > than >> >> > > > > > > > > > > > > > > > > > alternative 2. This would of course >> >> reduce >> >> > > the >> >> > > > > > sizes >> >> > > > > > > of >> >> > > > > > > > > the >> >> > > > > > > > > > > > other >> >> > > > > > > > > > > > > > > > memory >> >> > > > > > > > > > > > > > > > > > types. >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > How would alternative 2 now result >> in an >> >> > > under >> >> > > > > > > > > utilization >> >> > > > > > > > > > of >> >> > > > > > > > > > > > > > memory >> >> > > > > > > > > > > > > > > > > > compared to alternative 3? If >> >> alternative 3 >> >> > > > > > strictly >> >> > > > > > > > > sets a >> >> > > > > > > > > > > > > higher >> >> > > > > > > > > > > > > > > max >> >> > > > > > > > > > > > > > > > > > direct memory size and we use only >> >> little, >> >> > > > then I >> >> > > > > > > would >> >> > > > > > > > > > > expect >> >> > > > > > > > > > > > > that >> >> > > > > > > > > > > > > > > > > > alternative 3 results in memory under >> >> > > > > utilization. 
>> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > Cheers, >> >> > > > > > > > > > > > > > > > > > Till >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 4:19 PM Yang >> >> Wang < >> >> > > > > > > > > > > > danrtsey...@gmail.com >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > wrote: >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > Hi xintong,till >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Native and Direct Memory >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > My point is setting a very large >> max >> >> > direct >> >> > > > > > memory >> >> > > > > > > > size >> >> > > > > > > > > > > when >> >> > > > > > > > > > > > we >> >> > > > > > > > > > > > > > do >> >> > > > > > > > > > > > > > > > not >> >> > > > > > > > > > > > > > > > > > > differentiate direct and native >> >> memory. >> >> > If >> >> > > > the >> >> > > > > > > direct >> >> > > > > > > > > > > > > > > > memory,including >> >> > > > > > > > > > > > > > > > > > user >> >> > > > > > > > > > > > > > > > > > > direct memory and framework direct >> >> > > > memory,could >> >> > > > > > be >> >> > > > > > > > > > > calculated >> >> > > > > > > > > > > > > > > > > > > correctly,then >> >> > > > > > > > > > > > > > > > > > > i am in favor of setting direct >> memory >> >> > with >> >> > > > > fixed >> >> > > > > > > > > value. >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Memory Calculation >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > I agree with xintong. For Yarn and >> >> k8s,we >> >> > > > need >> >> > > > > to >> >> > > > > > > > check >> >> > > > > > > > > > the >> >> > > > > > > > > > > > > > memory >> >> > > > > > > > > > > > > > > > > > > configurations in client to avoid >> >> > > submitting >> >> > > > > > > > > successfully >> >> > > > > > > > > > > and >> >> > > > > > > > > > > > > > > failing >> >> > > > > > > > > > > > > > > > > in >> >> > > > > > > > > > > > > > > > > > > the flink master. >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > Best, >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > Yang >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > Xintong Song < >> tonysong...@gmail.com >> >> > > > > >于2019年8月13日 >> >> > > > > > > > > > 周二22:07写道: >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Thanks for replying, Till. >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > About MemorySegment, I think you >> are >> >> > > right >> >> > > > > that >> >> > > > > > > we >> >> > > > > > > > > > should >> >> > > > > > > > > > > > not >> >> > > > > > > > > > > > > > > > include >> >> > > > > > > > > > > > > > > > > > > this >> >> > > > > > > > > > > > > > > > > > > > issue in the scope of this FLIP. 
>> >> This >> >> > > FLIP >> >> > > > > > should >> >> > > > > > > > > > > > concentrate >> >> > > > > > > > > > > > > > on >> >> > > > > > > > > > > > > > > > how >> >> > > > > > > > > > > > > > > > > to >> >> > > > > > > > > > > > > > > > > > > > configure memory pools for >> >> > TaskExecutors, >> >> > > > > with >> >> > > > > > > > > minimum >> >> > > > > > > > > > > > > > > involvement >> >> > > > > > > > > > > > > > > > on >> >> > > > > > > > > > > > > > > > > > how >> >> > > > > > > > > > > > > > > > > > > > memory consumers use it. >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > About direct memory, I think >> >> > alternative >> >> > > 3 >> >> > > > > may >> >> > > > > > > not >> >> > > > > > > > > > having >> >> > > > > > > > > > > > the >> >> > > > > > > > > > > > > > > same >> >> > > > > > > > > > > > > > > > > over >> >> > > > > > > > > > > > > > > > > > > > reservation issue that >> alternative 2 >> >> > > does, >> >> > > > > but >> >> > > > > > at >> >> > > > > > > > the >> >> > > > > > > > > > > cost >> >> > > > > > > > > > > > of >> >> > > > > > > > > > > > > > > risk >> >> > > > > > > > > > > > > > > > of >> >> > > > > > > > > > > > > > > > > > > over >> >> > > > > > > > > > > > > > > > > > > > using memory at the container >> level, >> >> > > which >> >> > > > is >> >> > > > > > not >> >> > > > > > > > > good. >> >> > > > > > > > > > > My >> >> > > > > > > > > > > > > > point >> >> > > > > > > > > > > > > > > is >> >> > > > > > > > > > > > > > > > > > that >> >> > > > > > > > > > > > > > > > > > > > both "Task Off-Heap Memory" and >> "JVM >> >> > > > > Overhead" >> >> > > > > > > are >> >> > > > > > > > > not >> >> > > > > > > > > > > easy >> >> > > > > > > > > > > > > to >> >> > > > > > > > > > > > > > > > > config. >> >> > > > > > > > > > > > > > > > > > > For >> >> > > > > > > > > > > > > > > > > > > > alternative 2, users might >> configure >> >> > them >> >> > > > > > higher >> >> > > > > > > > than >> >> > > > > > > > > > > what >> >> > > > > > > > > > > > > > > actually >> >> > > > > > > > > > > > > > > > > > > needed, >> >> > > > > > > > > > > > > > > > > > > > just to avoid getting a direct >> OOM. >> >> For >> >> > > > > > > alternative >> >> > > > > > > > > 3, >> >> > > > > > > > > > > > users >> >> > > > > > > > > > > > > do >> >> > > > > > > > > > > > > > > not >> >> > > > > > > > > > > > > > > > > get >> >> > > > > > > > > > > > > > > > > > > > direct OOM, so they may not >> config >> >> the >> >> > > two >> >> > > > > > > options >> >> > > > > > > > > > > > > aggressively >> >> > > > > > > > > > > > > > > > high. >> >> > > > > > > > > > > > > > > > > > But >> >> > > > > > > > > > > > > > > > > > > > the consequences are risks of >> >> overall >> >> > > > > container >> >> > > > > > > > > memory >> >> > > > > > > > > > > > usage >> >> > > > > > > > > > > > > > > > exceeds >> >> > > > > > > > > > > > > > > > > > the >> >> > > > > > > > > > > > > > > > > > > > budget. 
>> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Thank you~ >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Xintong Song >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 9:39 AM >> Till >> >> > > > > Rohrmann < >> >> > > > > > > > > > > > > > > > trohrm...@apache.org> >> >> > > > > > > > > > > > > > > > > > > > wrote: >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > Thanks for proposing this FLIP >> >> > Xintong. >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > All in all I think it already >> >> looks >> >> > > quite >> >> > > > > > good. >> >> > > > > > > > > > > > Concerning >> >> > > > > > > > > > > > > > the >> >> > > > > > > > > > > > > > > > > first >> >> > > > > > > > > > > > > > > > > > > open >> >> > > > > > > > > > > > > > > > > > > > > question about allocating >> memory >> >> > > > segments, >> >> > > > > I >> >> > > > > > > was >> >> > > > > > > > > > > > wondering >> >> > > > > > > > > > > > > > > > whether >> >> > > > > > > > > > > > > > > > > > this >> >> > > > > > > > > > > > > > > > > > > > is >> >> > > > > > > > > > > > > > > > > > > > > strictly necessary to do in the >> >> > context >> >> > > > of >> >> > > > > > this >> >> > > > > > > > > FLIP >> >> > > > > > > > > > or >> >> > > > > > > > > > > > > > whether >> >> > > > > > > > > > > > > > > > > this >> >> > > > > > > > > > > > > > > > > > > > could >> >> > > > > > > > > > > > > > > > > > > > > be done as a follow up? Without >> >> > knowing >> >> > > > all >> >> > > > > > > > > details, >> >> > > > > > > > > > I >> >> > > > > > > > > > > > > would >> >> > > > > > > > > > > > > > be >> >> > > > > > > > > > > > > > > > > > > concerned >> >> > > > > > > > > > > > > > > > > > > > > that we would widen the scope >> of >> >> this >> >> > > > FLIP >> >> > > > > > too >> >> > > > > > > > much >> >> > > > > > > > > > > > because >> >> > > > > > > > > > > > > > we >> >> > > > > > > > > > > > > > > > > would >> >> > > > > > > > > > > > > > > > > > > have >> >> > > > > > > > > > > > > > > > > > > > > to touch all the existing call >> >> sites >> >> > of >> >> > > > the >> >> > > > > > > > > > > MemoryManager >> >> > > > > > > > > > > > > > where >> >> > > > > > > > > > > > > > > > we >> >> > > > > > > > > > > > > > > > > > > > allocate >> >> > > > > > > > > > > > > > > > > > > > > memory segments (this should >> >> mainly >> >> > be >> >> > > > > batch >> >> > > > > > > > > > > operators). >> >> > > > > > > > > > > > > The >> >> > > > > > > > > > > > > > > > > addition >> >> > > > > > > > > > > > > > > > > > > of >> >> > > > > > > > > > > > > > > > > > > > > the memory reservation call to >> the >> >> > > > > > > MemoryManager >> >> > > > > > > > > > should >> >> > > > > > > > > > > > not >> >> > > > > > > > > > > > > > be >> >> > > > > > > > > > > > > > > > > > affected >> >> > > > > > > > > > > > > > > > > > > > by >> >> > > > > > > > > > > > > > > > > > > > > this and I would hope that >> this is >> >> > the >> >> > > > only >> >> > > > > > > point >> >> > > > > > > > > of >> >> > > > > > > > > > > > > > > interaction >> >> > > > > > > > > > > > > > > > a >> >> > > > > > > > > > > > > > > > > > > > > streaming job would have with >> the >> >> > > > > > > MemoryManager. 
>> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > Concerning the second open >> >> question >> >> > > about >> >> > > > > > > setting >> >> > > > > > > > > or >> >> > > > > > > > > > > not >> >> > > > > > > > > > > > > > > setting >> >> > > > > > > > > > > > > > > > a >> >> > > > > > > > > > > > > > > > > > max >> >> > > > > > > > > > > > > > > > > > > > > direct memory limit, I would >> also >> >> be >> >> > > > > > interested >> >> > > > > > > > why >> >> > > > > > > > > > > Yang >> >> > > > > > > > > > > > > Wang >> >> > > > > > > > > > > > > > > > > thinks >> >> > > > > > > > > > > > > > > > > > > > > leaving it open would be best. >> My >> >> > > concern >> >> > > > > > about >> >> > > > > > > > > this >> >> > > > > > > > > > > > would >> >> > > > > > > > > > > > > be >> >> > > > > > > > > > > > > > > > that >> >> > > > > > > > > > > > > > > > > we >> >> > > > > > > > > > > > > > > > > > > > would >> >> > > > > > > > > > > > > > > > > > > > > be in a similar situation as we >> >> are >> >> > now >> >> > > > > with >> >> > > > > > > the >> >> > > > > > > > > > > > > > > > > RocksDBStateBackend. >> >> > > > > > > > > > > > > > > > > > > If >> >> > > > > > > > > > > > > > > > > > > > > the different memory pools are >> not >> >> > > > clearly >> >> > > > > > > > > separated >> >> > > > > > > > > > > and >> >> > > > > > > > > > > > > can >> >> > > > > > > > > > > > > > > > spill >> >> > > > > > > > > > > > > > > > > > over >> >> > > > > > > > > > > > > > > > > > > > to >> >> > > > > > > > > > > > > > > > > > > > > a different pool, then it is >> quite >> >> > hard >> >> > > > to >> >> > > > > > > > > understand >> >> > > > > > > > > > > > what >> >> > > > > > > > > > > > > > > > exactly >> >> > > > > > > > > > > > > > > > > > > > causes a >> >> > > > > > > > > > > > > > > > > > > > > process to get killed for using >> >> too >> >> > > much >> >> > > > > > > memory. >> >> > > > > > > > > This >> >> > > > > > > > > > > > could >> >> > > > > > > > > > > > > > > then >> >> > > > > > > > > > > > > > > > > > easily >> >> > > > > > > > > > > > > > > > > > > > > lead to a similar situation >> what >> >> we >> >> > > have >> >> > > > > with >> >> > > > > > > the >> >> > > > > > > > > > > > > > cutoff-ratio. >> >> > > > > > > > > > > > > > > > So >> >> > > > > > > > > > > > > > > > > > why >> >> > > > > > > > > > > > > > > > > > > > not >> >> > > > > > > > > > > > > > > > > > > > > setting a sane default value >> for >> >> max >> >> > > > direct >> >> > > > > > > > memory >> >> > > > > > > > > > and >> >> > > > > > > > > > > > > giving >> >> > > > > > > > > > > > > > > the >> >> > > > > > > > > > > > > > > > > > user >> >> > > > > > > > > > > > > > > > > > > an >> >> > > > > > > > > > > > > > > > > > > > > option to increase it if he >> runs >> >> into >> >> > > an >> >> > > > > OOM. >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > @Xintong, how would >> alternative 2 >> >> > lead >> >> > > to >> >> > > > > > lower >> >> > > > > > > > > > memory >> >> > > > > > > > > > > > > > > > utilization >> >> > > > > > > > > > > > > > > > > > than >> >> > > > > > > > > > > > > > > > > > > > > alternative 3 where we set the >> >> direct >> >> > > > > memory >> >> > > > > > > to a >> >> > > > > > > > > > > higher >> >> > > > > > > > > > > > > > value? 
Cheers,
Till

On Fri, Aug 9, 2019 at 9:12 AM Xintong Song <tonysong...@gmail.com> wrote:

Thanks for the feedback, Yang.

Regarding your comments:

*Native and Direct Memory*
I think setting a very large max direct memory size definitely has some good sides. E.g., we do not need to worry about direct OOMs, and we don't even need to allocate managed / network memory with Unsafe.allocate(). However, there are also some downsides to doing this.

- One thing I can think of is that if a task executor container is killed due to overusing memory, it could be hard for us to know which part of the memory is overused.
- Another downside is that the JVM never triggers GC due to reaching the max direct memory limit, because the limit is too high to be reached. That means we kind of rely on heap memory to trigger GC and release direct memory. That could be a problem in cases where we have more direct memory usage but not enough heap activity to trigger the GC.

Maybe you can share your reasons for preferring to set a very large value, if there is anything else I overlooked.

*Memory Calculation*
If there is any conflict between multiple configuration values that the user explicitly specified, I think we should throw an error.
I think doing the checking on the client side is a good idea, so that on Yarn / K8s we can discover the problem before submitting the Flink cluster, which is always a good thing. But we cannot rely only on the client-side checking, because for a standalone cluster TaskManagers on different machines may have different configurations and the client does not see them.
What do you think?
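To illustrate the two points above, here is a minimal, self-contained Java sketch (not Flink code; "Unsafe.allocate()" in the discussion presumably refers to sun.misc.Unsafe#allocateMemory). A direct ByteBuffer counts against -XX:MaxDirectMemorySize and its native memory is only released once the buffer object is garbage-collected, whereas Unsafe-allocated memory sits outside that limit and must be freed explicitly:

    import java.lang.reflect.Field;
    import java.nio.ByteBuffer;

    // Illustrative sketch contrasting the two allocation paths discussed above.
    public class DirectVsNativeAllocation {

        public static void main(String[] args) throws Exception {
            // Path 1: direct ByteBuffer. The reserved native memory is counted against
            // -XX:MaxDirectMemorySize and is only released when the buffer object is
            // garbage-collected (or when hitting the limit forces a System.gc()). With a
            // very large limit and little heap activity, the native memory can linger.
            ByteBuffer direct = ByteBuffer.allocateDirect(64 * 1024 * 1024);
            direct = null; // stays reserved until a GC actually collects the buffer

            // Path 2: sun.misc.Unsafe#allocateMemory. This is plain native memory: it is
            // NOT counted against -XX:MaxDirectMemorySize and must be freed explicitly.
            Field f = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            sun.misc.Unsafe unsafe = (sun.misc.Unsafe) f.get(null);
            long address = unsafe.allocateMemory(64 * 1024 * 1024);
            try {
                unsafe.putLong(address, 42L); // use the memory
            } finally {
                unsafe.freeMemory(address);   // explicit release, independent of the GC
            }
        }
    }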
Thank you~

Xintong Song

On Thu, Aug 8, 2019 at 5:09 PM Yang Wang <danrtsey...@gmail.com> wrote:

Hi xintong,

Thanks for your detailed proposal. After all the memory configurations are introduced, it will be more powerful to control the Flink memory usage. I just have a few questions about it.

- Native and Direct Memory
We do not differentiate user direct memory and native memory. They are all included in task off-heap memory. Right? So I don't think we could set the -XX:MaxDirectMemorySize properly. I prefer leaving it a very large value.

- Memory Calculation
If the sum of the fine-grained memory (network memory, managed memory, etc.) is larger than the total process memory, how do we deal with this situation? Do we need to check the memory configuration in the client?

Xintong Song <tonysong...@gmail.com> wrote on Wed, Aug 7, 2019 at 10:14 PM:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors"[1], where we describe how to improve TaskExecutor memory configurations. The FLIP document is mostly based on an early design "Memory Management and Configuration Reloaded"[2] by Stephan, with updates from follow-up discussions both online and offline.

This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration.

- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- Complicated, uncertain and hard to understand.
Key changes to solve the problems can be summarized as follows.

- Extend the memory manager to also account for memory usage by state backends.
- Modify how TaskExecutor memory is partitioned and accounted into individual memory reservations and pools.
- Simplify memory configuration options and calculation logic.

Please find more details in the FLIP wiki document [1].

(Please note that the early design doc [2] is out of sync, and it is appreciated to have the discussion in this mailing list thread.)

Looking forward to your feedback.
Thank you~

Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
[2] https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing

On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <tonysong...@gmail.com> wrote:

Thanks for sharing your opinion Till.

I'm also in favor of alternative 2. I was wondering whether we can avoid using Unsafe.allocate() for off-heap managed memory and network memory with alternative 3.
But after giving it a second thought, I think even for alternative 3, using direct memory for off-heap managed memory could cause problems.

Hi Yang,

Regarding your concern, I think what is proposed in this FLIP is to have both off-heap managed memory and network memory allocated through Unsafe.allocate(), which means they are practically native memory and not limited by the JVM max direct memory. The only parts of memory limited by the JVM max direct memory are task off-heap memory and JVM overhead, which is exactly what alternative 2 suggests setting the JVM max direct memory to.

Thank you~

Xintong Song

On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <trohrm...@apache.org> wrote:

Thanks for the clarification Xintong. I understand the two alternatives now.

I would be in favour of option 2 because it makes things explicit. If we don't limit the direct memory, I fear that we might end up in a similar situation as we are currently in: the user might see that her process gets killed by the OS and does not know why this is the case. Consequently, she tries to decrease the process memory size (similar to increasing the cutoff ratio) in order to accommodate for the extra direct memory. Even worse, she tries to decrease memory budgets which are not fully used and hence won't change the overall memory consumption.

Cheers,
Till

On Fri, Aug 16, 2019 at 11:01 AM Xintong Song <tonysong...@gmail.com> wrote:

Let me explain this with a concrete example, Till.

Let's say we have the following scenario.

Total Process Memory: 1GB
JVM Direct Memory (Task Off-Heap Memory + JVM Overhead): 200MB
Other Memory (JVM Heap Memory, JVM Metaspace, Off-Heap Managed Memory and Network Memory): 800MB

For alternative 2, we set -XX:MaxDirectMemorySize to 200MB.
For alternative 3, we set -XX:MaxDirectMemorySize to a very large value, let's say 1TB.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead does not exceed 200MB, then alternative 2 and alternative 3 should have the same utility. Setting a larger -XX:MaxDirectMemorySize will not reduce the sizes of the other memory pools.

If the actual direct memory usage of Task Off-Heap Memory and JVM Overhead could exceed 200MB, then:

- Alternative 2 suffers from frequent OOMs. To avoid that, the only thing the user can do is to modify the configuration and increase JVM Direct Memory (Task Off-Heap Memory + JVM Overhead). Let's say the user increases JVM Direct Memory to 250MB; this will reduce the total size of the other memory pools to 750MB, given that the total process memory remains 1GB.
- For alternative 3, there is no chance of a direct OOM. There are chances of exceeding the total process memory limit, but given that the process may not use up all the reserved native memory (Off-Heap Managed Memory, Network Memory, JVM Metaspace), if the actual direct memory usage is slightly above yet very close to 200MB, the user probably does not need to change the configuration.

Therefore, I think from the user's perspective, a feasible configuration for alternative 2 may lead to lower resource utilization compared to alternative 3.
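For concreteness, here is a small sketch of how the numbers in the example above translate into JVM flags. The 150 MB / 50 MB split of the 200 MB direct budget is an assumption made only for illustration, and the helper names are hypothetical rather than anything from the FLIP:

    // Illustrative only: derives the example JVM flags for alternatives 2 and 3.
    public class MaxDirectMemoryExample {

        static String alternative2Flag(long taskOffHeapMb, long jvmOverheadMb) {
            // Alternative 2: cap direct memory at exactly Task Off-Heap Memory + JVM Overhead.
            return "-XX:MaxDirectMemorySize=" + (taskOffHeapMb + jvmOverheadMb) + "m";
        }

        static String alternative3Flag() {
            // Alternative 3: effectively unlimited direct memory (1 TB in the example).
            return "-XX:MaxDirectMemorySize=1024g";
        }

        public static void main(String[] args) {
            long totalProcessMb = 1024;  // Total Process Memory: 1 GB
            long taskOffHeapMb = 150;    // assumed split of the 200 MB direct budget
            long jvmOverheadMb = 50;
            // 800 MB left for heap, metaspace, off-heap managed and network memory
            long otherMb = totalProcessMb - taskOffHeapMb - jvmOverheadMb;

            System.out.println("Alternative 2: " + alternative2Flag(taskOffHeapMb, jvmOverheadMb)
                    + " (other pools: " + otherMb + "m)");
            System.out.println("Alternative 3: " + alternative3Flag()
                    + " (other pools still: " + otherMb + "m)");
        }
    }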
Thank you~

Xintong Song

On Fri, Aug 16, 2019 at 10:28 AM Till Rohrmann <trohrm...@apache.org> wrote:

I guess you have to help me understand the difference between alternative 2 and 3 w.r.t. memory under-utilization, Xintong.

- Alternative 2: set -XX:MaxDirectMemorySize to Task Off-Heap Memory and JVM Overhead. Then there is the risk that this size is too low, resulting in a lot of garbage collection and potentially an OOM.
- Alternative 3: set -XX:MaxDirectMemorySize to something larger than alternative 2. This would of course reduce the sizes of the other memory types.

How would alternative 2 now result in an under-utilization of memory compared to alternative 3? If alternative 3 strictly sets a higher max direct memory size and we use only little of it, then I would expect that alternative 3 results in memory under-utilization.

Cheers,
Till

On Tue, Aug 13, 2019 at 4:19 PM Yang Wang <danrtsey...@gmail.com> wrote:

Hi xintong, till

> Native and Direct Memory
My point is to set a very large max direct memory size when we do not differentiate direct and native memory. If the direct memory, including user direct memory and framework direct memory, could be calculated correctly, then I am in favor of setting the direct memory to a fixed value.

> Memory Calculation
I agree with xintong. For Yarn and K8s, we need to check the memory configurations in the client to avoid submitting successfully and then failing in the Flink master.

Best,
Yang

Xintong Song <tonysong...@gmail.com> wrote on Tue, Aug 13, 2019 at 22:07:

Thanks for replying, Till.

About MemorySegment, I think you are right that we should not include this issue in the scope of this FLIP.
This FLIP should concentrate on how to configure memory pools for TaskExecutors, with minimal involvement in how memory consumers use them.

About direct memory, I think alternative 3 may not have the same over-reservation issue that alternative 2 does, but at the cost of risking over-use of memory at the container level, which is not good. My point is that both "Task Off-Heap Memory" and "JVM Overhead" are not easy to configure. For alternative 2, users might configure them higher than what is actually needed, just to avoid getting a direct OOM. For alternative 3, users do not get direct OOMs, so they may not configure the two options aggressively high. But the consequence is a risk that the overall container memory usage exceeds the budget.

Thank you~

Xintong Song
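As a rough sketch of the client-side check discussed above (hypothetical method and pool names, not the FLIP's actual API), the client could sum the explicitly configured pools and fail fast, before submitting to Yarn / K8s, if they exceed the total process memory:

    // Hypothetical sketch of a client-side sanity check, not the FLIP's actual implementation.
    public class MemoryConfigCheck {

        static void validate(long totalProcessMb,
                             long frameworkHeapMb,
                             long taskHeapMb,
                             long taskOffHeapMb,
                             long networkMb,
                             long managedMb,
                             long metaspaceMb,
                             long jvmOverheadMb) {
            long sum = frameworkHeapMb + taskHeapMb + taskOffHeapMb
                    + networkMb + managedMb + metaspaceMb + jvmOverheadMb;
            if (sum > totalProcessMb) {
                // Conflicting explicit settings: fail on the client before the cluster is deployed.
                throw new IllegalArgumentException(
                        "Sum of configured memory pools (" + sum + "m) exceeds total process memory ("
                                + totalProcessMb + "m).");
            }
        }

        public static void main(String[] args) {
            // Example values chosen to add up to the 1 GB total process memory from the thread.
            validate(1024, 128, 300, 150, 128, 228, 40, 50); // passes: sum == 1024
        }
    }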
>> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Thank you~ >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Xintong Song >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 9:39 AM >> Till >> >> > > > > Rohrmann < >> >> > > > > > > > > > > > > > > > trohrm...@apache.org> >> >> > > > > > > > > > > > > > > > > > > > wrote: >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > Thanks for proposing this FLIP >> >> > Xintong. >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > All in all I think it already >> >> looks >> >> > > quite >> >> > > > > > good. >> >> > > > > > > > > > > > Concerning >> >> > > > > > > > > > > > > > the >> >> > > > > > > > > > > > > > > > > first >> >> > > > > > > > > > > > > > > > > > > open >> >> > > > > > > > > > > > > > > > > > > > > question about allocating >> memory >> >> > > > segments, >> >> > > > > I >> >> > > > > > > was >> >> > > > > > > > > > > > wondering >> >> > > > > > > > > > > > > > > > whether >> >> > > > > > > > > > > > > > > > > > this >> >> > > > > > > > > > > > > > > > > > > > is >> >> > > > > > > > > > > > > > > > > > > > > strictly necessary to do in the >> >> > context >> >> > > > of >> >> > > > > > this >> >> > > > > > > > > FLIP >> >> > > > > > > > > > or >> >> > > > > > > > > > > > > > whether >> >> > > > > > > > > > > > > > > > > this >> >> > > > > > > > > > > > > > > > > > > > could >> >> > > > > > > > > > > > > > > > > > > > > be done as a follow up? Without >> >> > knowing >> >> > > > all >> >> > > > > > > > > details, >> >> > > > > > > > > > I >> >> > > > > > > > > > > > > would >> >> > > > > > > > > > > > > > be >> >> > > > > > > > > > > > > > > > > > > concerned >> >> > > > > > > > > > > > > > > > > > > > > that we would widen the scope >> of >> >> this >> >> > > > FLIP >> >> > > > > > too >> >> > > > > > > > much >> >> > > > > > > > > > > > because >> >> > > > > > > > > > > > > > we >> >> > > > > > > > > > > > > > > > > would >> >> > > > > > > > > > > > > > > > > > > have >> >> > > > > > > > > > > > > > > > > > > > > to touch all the existing call >> >> sites >> >> > of >> >> > > > the >> >> > > > > > > > > > > MemoryManager >> >> > > > > > > > > > > > > > where >> >> > > > > > > > > > > > > > > > we >> >> > > > > > > > > > > > > > > > > > > > allocate >> >> > > > > > > > > > > > > > > > > > > > > memory segments (this should >> >> mainly >> >> > be >> >> > > > > batch >> >> > > > > > > > > > > operators). >> >> > > > > > > > > > > > > The >> >> > > > > > > > > > > > > > > > > addition >> >> > > > > > > > > > > > > > > > > > > of >> >> > > > > > > > > > > > > > > > > > > > > the memory reservation call to >> the >> >> > > > > > > MemoryManager >> >> > > > > > > > > > should >> >> > > > > > > > > > > > not >> >> > > > > > > > > > > > > > be >> >> > > > > > > > > > > > > > > > > > affected >> >> > > > > > > > > > > > > > > > > > > > by >> >> > > > > > > > > > > > > > > > > > > > > this and I would hope that >> this is >> >> > the >> >> > > > only >> >> > > > > > > point >> >> > > > > > > > > of >> >> > > > > > > > > > > > > > > interaction >> >> > > > > > > > > > > > > > > > a >> >> > > > > > > > > > > > > > > > > > > > > streaming job would have with >> the >> >> > > > > > > MemoryManager. 
>> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > Concerning the second open >> >> question >> >> > > about >> >> > > > > > > setting >> >> > > > > > > > > or >> >> > > > > > > > > > > not >> >> > > > > > > > > > > > > > > setting >> >> > > > > > > > > > > > > > > > a >> >> > > > > > > > > > > > > > > > > > max >> >> > > > > > > > > > > > > > > > > > > > > direct memory limit, I would >> also >> >> be >> >> > > > > > interested >> >> > > > > > > > why >> >> > > > > > > > > > > Yang >> >> > > > > > > > > > > > > Wang >> >> > > > > > > > > > > > > > > > > thinks >> >> > > > > > > > > > > > > > > > > > > > > leaving it open would be best. >> My >> >> > > concern >> >> > > > > > about >> >> > > > > > > > > this >> >> > > > > > > > > > > > would >> >> > > > > > > > > > > > > be >> >> > > > > > > > > > > > > > > > that >> >> > > > > > > > > > > > > > > > > we >> >> > > > > > > > > > > > > > > > > > > > would >> >> > > > > > > > > > > > > > > > > > > > > be in a similar situation as we >> >> are >> >> > now >> >> > > > > with >> >> > > > > > > the >> >> > > > > > > > > > > > > > > > > RocksDBStateBackend. >> >> > > > > > > > > > > > > > > > > > > If >> >> > > > > > > > > > > > > > > > > > > > > the different memory pools are >> not >> >> > > > clearly >> >> > > > > > > > > separated >> >> > > > > > > > > > > and >> >> > > > > > > > > > > > > can >> >> > > > > > > > > > > > > > > > spill >> >> > > > > > > > > > > > > > > > > > over >> >> > > > > > > > > > > > > > > > > > > > to >> >> > > > > > > > > > > > > > > > > > > > > a different pool, then it is >> quite >> >> > hard >> >> > > > to >> >> > > > > > > > > understand >> >> > > > > > > > > > > > what >> >> > > > > > > > > > > > > > > > exactly >> >> > > > > > > > > > > > > > > > > > > > causes a >> >> > > > > > > > > > > > > > > > > > > > > process to get killed for using >> >> too >> >> > > much >> >> > > > > > > memory. >> >> > > > > > > > > This >> >> > > > > > > > > > > > could >> >> > > > > > > > > > > > > > > then >> >> > > > > > > > > > > > > > > > > > easily >> >> > > > > > > > > > > > > > > > > > > > > lead to a similar situation >> what >> >> we >> >> > > have >> >> > > > > with >> >> > > > > > > the >> >> > > > > > > > > > > > > > cutoff-ratio. >> >> > > > > > > > > > > > > > > > So >> >> > > > > > > > > > > > > > > > > > why >> >> > > > > > > > > > > > > > > > > > > > not >> >> > > > > > > > > > > > > > > > > > > > > setting a sane default value >> for >> >> max >> >> > > > direct >> >> > > > > > > > memory >> >> > > > > > > > > > and >> >> > > > > > > > > > > > > giving >> >> > > > > > > > > > > > > > > the >> >> > > > > > > > > > > > > > > > > > user >> >> > > > > > > > > > > > > > > > > > > an >> >> > > > > > > > > > > > > > > > > > > > > option to increase it if he >> runs >> >> into >> >> > > an >> >> > > > > OOM. >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > @Xintong, how would >> alternative 2 >> >> > lead >> >> > > to >> >> > > > > > lower >> >> > > > > > > > > > memory >> >> > > > > > > > > > > > > > > > utilization >> >> > > > > > > > > > > > > > > > > > than >> >> > > > > > > > > > > > > > > > > > > > > alternative 3 where we set the >> >> direct >> >> > > > > memory >> >> > > > > > > to a >> >> > > > > > > > > > > higher >> >> > > > > > > > > > > > > > value? 
>> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > Cheers, >> >> > > > > > > > > > > > > > > > > > > > > Till >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > On Fri, Aug 9, 2019 at 9:12 AM >> >> > Xintong >> >> > > > > Song < >> >> > > > > > > > > > > > > > > > tonysong...@gmail.com >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > wrote: >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > Thanks for the feedback, >> Yang. >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > Regarding your comments: >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > *Native and Direct Memory* >> >> > > > > > > > > > > > > > > > > > > > > > I think setting a very large >> max >> >> > > direct >> >> > > > > > > memory >> >> > > > > > > > > size >> >> > > > > > > > > > > > > > > definitely >> >> > > > > > > > > > > > > > > > > has >> >> > > > > > > > > > > > > > > > > > > some >> >> > > > > > > > > > > > > > > > > > > > > > good sides. E.g., we do not >> >> worry >> >> > > about >> >> > > > > > > direct >> >> > > > > > > > > OOM, >> >> > > > > > > > > > > and >> >> > > > > > > > > > > > > we >> >> > > > > > > > > > > > > > > > don't >> >> > > > > > > > > > > > > > > > > > even >> >> > > > > > > > > > > > > > > > > > > > > need >> >> > > > > > > > > > > > > > > > > > > > > > to allocate managed / network >> >> > memory >> >> > > > with >> >> > > > > > > > > > > > > > Unsafe.allocate() . >> >> > > > > > > > > > > > > > > > > > > > > > However, there are also some >> >> down >> >> > > sides >> >> > > > > of >> >> > > > > > > > doing >> >> > > > > > > > > > > this. >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > - One thing I can think >> of is >> >> > that >> >> > > > if >> >> > > > > a >> >> > > > > > > task >> >> > > > > > > > > > > > executor >> >> > > > > > > > > > > > > > > > > container >> >> > > > > > > > > > > > > > > > > > is >> >> > > > > > > > > > > > > > > > > > > > > > killed due to overusing >> >> memory, >> >> > it >> >> > > > > could >> >> > > > > > > be >> >> > > > > > > > > hard >> >> > > > > > > > > > > for >> >> > > > > > > > > > > > > use >> >> > > > > > > > > > > > > > > to >> >> > > > > > > > > > > > > > > > > know >> >> > > > > > > > > > > > > > > > > > > > which >> >> > > > > > > > > > > > > > > > > > > > > > part >> >> > > > > > > > > > > > > > > > > > > > > > of the memory is overused. >> >> > > > > > > > > > > > > > > > > > > > > > - Another down side is >> that >> >> the >> >> > > JVM >> >> > > > > > never >> >> > > > > > > > > > trigger >> >> > > > > > > > > > > GC >> >> > > > > > > > > > > > > due >> >> > > > > > > > > > > > > > > to >> >> > > > > > > > > > > > > > > > > > > reaching >> >> > > > > > > > > > > > > > > > > > > > > max >> >> > > > > > > > > > > > > > > > > > > > > > direct memory limit, >> because >> >> the >> >> > > > limit >> >> > > > > > is >> >> > > > > > > > too >> >> > > > > > > > > > high >> >> > > > > > > > > > > > to >> >> > > > > > > > > > > > > be >> >> > > > > > > > > > > > > > > > > > reached. 
>> >> > > > > > > > > > > > > > > > > > > > That >> >> > > > > > > > > > > > > > > > > > > > > > means we kind of relay on >> >> heap >> >> > > > memory >> >> > > > > to >> >> > > > > > > > > trigger >> >> > > > > > > > > > > GC >> >> > > > > > > > > > > > > and >> >> > > > > > > > > > > > > > > > > release >> >> > > > > > > > > > > > > > > > > > > > direct >> >> > > > > > > > > > > > > > > > > > > > > > memory. That could be a >> >> problem >> >> > in >> >> > > > > cases >> >> > > > > > > > where >> >> > > > > > > > > > we >> >> > > > > > > > > > > > have >> >> > > > > > > > > > > > > > > more >> >> > > > > > > > > > > > > > > > > > direct >> >> > > > > > > > > > > > > > > > > > > > > > memory >> >> > > > > > > > > > > > > > > > > > > > > > usage but not enough heap >> >> > activity >> >> > > > to >> >> > > > > > > > trigger >> >> > > > > > > > > > the >> >> > > > > > > > > > > > GC. >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > Maybe you can share your >> reasons >> >> > for >> >> > > > > > > preferring >> >> > > > > > > > > > > > setting a >> >> > > > > > > > > > > > > > > very >> >> > > > > > > > > > > > > > > > > > large >> >> > > > > > > > > > > > > > > > > > > > > value, >> >> > > > > > > > > > > > > > > > > > > > > > if there are anything else I >> >> > > > overlooked. >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > *Memory Calculation* >> >> > > > > > > > > > > > > > > > > > > > > > If there is any conflict >> between >> >> > > > multiple >> >> > > > > > > > > > > configuration >> >> > > > > > > > > > > > > > that >> >> > > > > > > > > > > > > > > > user >> >> > > > > > > > > > > > > > > > > > > > > > explicitly specified, I >> think we >> >> > > should >> >> > > > > > throw >> >> > > > > > > > an >> >> > > > > > > > > > > error. >> >> > > > > > > > > > > > > > > > > > > > > > I think doing checking on the >> >> > client >> >> > > > side >> >> > > > > > is >> >> > > > > > > a >> >> > > > > > > > > good >> >> > > > > > > > > > > > idea, >> >> > > > > > > > > > > > > > so >> >> > > > > > > > > > > > > > > > that >> >> > > > > > > > > > > > > > > > > > on >> >> > > > > > > > > > > > > > > > > > > > > Yarn / >> >> > > > > > > > > > > > > > > > > > > > > > K8s we can discover the >> problem >> >> > > before >> >> > > > > > > > submitting >> >> > > > > > > > > > the >> >> > > > > > > > > > > > > Flink >> >> > > > > > > > > > > > > > > > > > cluster, >> >> > > > > > > > > > > > > > > > > > > > > which >> >> > > > > > > > > > > > > > > > > > > > > > is always a good thing. >> >> > > > > > > > > > > > > > > > > > > > > > But we can not only rely on >> the >> >> > > client >> >> > > > > side >> >> > > > > > > > > > checking, >> >> > > > > > > > > > > > > > because >> >> > > > > > > > > > > > > > > > for >> >> > > > > > > > > > > > > > > > > > > > > > standalone cluster >> TaskManagers >> >> on >> >> > > > > > different >> >> > > > > > > > > > machines >> >> > > > > > > > > > > > may >> >> > > > > > > > > > > > > > > have >> >> > > > > > > > > > > > > > > > > > > > different >> >> > > > > > > > > > > > > > > > > > > > > > configurations and the client >> >> does >> >> > > see >> >> > > > > > that. >> >> > > > > > > > > > > > > > > > > > > > > > What do you think? 
>> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > Thank you~ >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > Xintong Song >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > On Thu, Aug 8, 2019 at 5:09 >> PM >> >> Yang >> >> > > > Wang >> >> > > > > < >> >> > > > > > > > > > > > > > > > danrtsey...@gmail.com> >> >> > > > > > > > > > > > > > > > > > > > wrote: >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > Hi xintong, >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > Thanks for your detailed >> >> > proposal. >> >> > > > > After >> >> > > > > > > all >> >> > > > > > > > > the >> >> > > > > > > > > > > > memory >> >> > > > > > > > > > > > > > > > > > > configuration >> >> > > > > > > > > > > > > > > > > > > > > are >> >> > > > > > > > > > > > > > > > > > > > > > > introduced, it will be more >> >> > > powerful >> >> > > > to >> >> > > > > > > > control >> >> > > > > > > > > > the >> >> > > > > > > > > > > > > flink >> >> > > > > > > > > > > > > > > > > memory >> >> > > > > > > > > > > > > > > > > > > > > usage. I >> >> > > > > > > > > > > > > > > > > > > > > > > just have few questions >> about >> >> it. >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > - Native and Direct >> Memory >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > We do not differentiate >> user >> >> > direct >> >> > > > > > memory >> >> > > > > > > > and >> >> > > > > > > > > > > native >> >> > > > > > > > > > > > > > > memory. >> >> > > > > > > > > > > > > > > > > > They >> >> > > > > > > > > > > > > > > > > > > > are >> >> > > > > > > > > > > > > > > > > > > > > > all >> >> > > > > > > > > > > > > > > > > > > > > > > included in task off-heap >> >> memory. >> >> > > > > Right? >> >> > > > > > > So i >> >> > > > > > > > > > don’t >> >> > > > > > > > > > > > > think >> >> > > > > > > > > > > > > > > we >> >> > > > > > > > > > > > > > > > > > could >> >> > > > > > > > > > > > > > > > > > > > not >> >> > > > > > > > > > > > > > > > > > > > > > set >> >> > > > > > > > > > > > > > > > > > > > > > > the -XX:MaxDirectMemorySize >> >> > > > properly. I >> >> > > > > > > > prefer >> >> > > > > > > > > > > > leaving >> >> > > > > > > > > > > > > > it a >> >> > > > > > > > > > > > > > > > > very >> >> > > > > > > > > > > > > > > > > > > > large >> >> > > > > > > > > > > > > > > > > > > > > > > value. >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > - Memory Calculation >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > If the sum of and >> fine-grained >> >> > > > > > > memory(network >> >> > > > > > > > > > > memory, >> >> > > > > > > > > > > > > > > managed >> >> > > > > > > > > > > > > > > > > > > memory, >> >> > > > > > > > > > > > > > > > > > > > > > etc.) 
>> >> > > > > > > > > > > > > > > > > > > > > > > is larger than total >> process >> >> > > memory, >> >> > > > > how >> >> > > > > > do >> >> > > > > > > > we >> >> > > > > > > > > > deal >> >> > > > > > > > > > > > > with >> >> > > > > > > > > > > > > > > this >> >> > > > > > > > > > > > > > > > > > > > > situation? >> >> > > > > > > > > > > > > > > > > > > > > > Do >> >> > > > > > > > > > > > > > > > > > > > > > > we need to check the memory >> >> > > > > configuration >> >> > > > > > > in >> >> > > > > > > > > > > client? >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > Xintong Song < >> >> > > tonysong...@gmail.com> >> >> > > > > > > > > > 于2019年8月7日周三 >> >> > > > > > > > > > > > > > > 下午10:14写道: >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > Hi everyone, >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > We would like to start a >> >> > > discussion >> >> > > > > > > thread >> >> > > > > > > > on >> >> > > > > > > > > > > > > "FLIP-49: >> >> > > > > > > > > > > > > > > > > Unified >> >> > > > > > > > > > > > > > > > > > > > > Memory >> >> > > > > > > > > > > > > > > > > > > > > > > > Configuration for >> >> > > > TaskExecutors"[1], >> >> > > > > > > where >> >> > > > > > > > we >> >> > > > > > > > > > > > > describe >> >> > > > > > > > > > > > > > > how >> >> > > > > > > > > > > > > > > > to >> >> > > > > > > > > > > > > > > > > > > > improve >> >> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor memory >> >> > > configurations. >> >> > > > > The >> >> > > > > > > > FLIP >> >> > > > > > > > > > > > document >> >> > > > > > > > > > > > > > is >> >> > > > > > > > > > > > > > > > > mostly >> >> > > > > > > > > > > > > > > > > > > > based >> >> > > > > > > > > > > > > > > > > > > > > > on >> >> > > > > > > > > > > > > > > > > > > > > > > an >> >> > > > > > > > > > > > > > > > > > > > > > > > early design "Memory >> >> Management >> >> > > and >> >> > > > > > > > > > Configuration >> >> > > > > > > > > > > > > > > > > Reloaded"[2] >> >> > > > > > > > > > > > > > > > > > by >> >> > > > > > > > > > > > > > > > > > > > > > > Stephan, >> >> > > > > > > > > > > > > > > > > > > > > > > > with updates from >> follow-up >> >> > > > > discussions >> >> > > > > > > > both >> >> > > > > > > > > > > online >> >> > > > > > > > > > > > > and >> >> > > > > > > > > > > > > > > > > > offline. >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > This FLIP addresses >> several >> >> > > > > > shortcomings >> >> > > > > > > of >> >> > > > > > > > > > > current >> >> > > > > > > > > > > > > > > (Flink >> >> > > > > > > > > > > > > > > > > 1.9) >> >> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor memory >> >> > > configuration. >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Different >> configuration >> >> > for >> >> > > > > > > Streaming >> >> > > > > > > > > and >> >> > > > > > > > > > > > Batch. >> >> > > > > > > > > > > > > > > > > > > > > > > > - Complex and >> difficult >> >> > > > > > configuration >> >> > > > > > > of >> >> > > > > > > > > > > RocksDB >> >> > > > > > > > > > > > > in >> >> > > > > > > > > > > > > > > > > > Streaming. >> >> > > > > > > > > > > > > > > > > > > > > > > > - Complicated, >> uncertain >> >> and >> >> > > > hard >> >> > > > > to >> >> > > > > > > > > > > understand. 
On Wed, Aug 7, 2019 at 10:14 PM Xintong Song <tonysong...@gmail.com> wrote:

Hi everyone,

We would like to start a discussion thread on "FLIP-49: Unified Memory Configuration for TaskExecutors" [1], where we describe how to improve TaskExecutor memory configuration. The FLIP document is mostly based on an early design, "Memory Management and Configuration Reloaded" [2] by Stephan, with updates from follow-up discussions both online and offline.

This FLIP addresses several shortcomings of the current (Flink 1.9) TaskExecutor memory configuration:

- Different configuration for Streaming and Batch.
- Complex and difficult configuration of RocksDB in Streaming.
- Complicated, uncertain and hard-to-understand calculation logic.

The key changes to solve these problems can be summarized as follows.

- Extend the memory manager to also account for memory usage by state backends.
- Modify how TaskExecutor memory is partitioned into individual memory reservations and pools.
- Simplify the memory configuration options and calculation logic (see the sketch below).
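As a rough illustration of what a simplified, top-down calculation could look like: start from the total process memory, subtract fixed JVM budgets, and split the remainder by fractions. The fractions and names below are placeholders, not the FLIP's actual defaults; the real derivation rules are in the wiki document [1].

// Rough sketch of a top-down derivation: total process memory minus fixed JVM
// budgets, with the remainder split by fractions. All fractions and names are
// placeholders; the actual rules and option keys are defined in the FLIP wiki.
public class MemoryDerivationSketch {

    public static void main(String[] args) {
        long totalProcess = 4L << 30;                     // example: 4 GiB process budget
        long jvmMetaspace = 256L << 20;                   // assumed fixed metaspace budget
        long jvmOverhead  = (long) (totalProcess * 0.1);  // assumed overhead fraction

        long totalFlink = totalProcess - jvmMetaspace - jvmOverhead;
        long network    = (long) (totalFlink * 0.1);      // assumed network fraction
        long managed    = (long) (totalFlink * 0.4);      // assumed managed fraction
        long remaining  = totalFlink - network - managed; // heap and task off-heap

        System.out.printf("flink=%d network=%d managed=%d remaining=%d%n",
            totalFlink, network, managed, remaining);
    }
}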
Please find more details in the FLIP wiki document [1].

(Please note that the early design doc [2] is out of sync; it would be appreciated to have the discussion in this mailing list thread.)

Looking forward to your feedback.


Thank you~

Xintong Song


[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors

[2]
https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing