I also agree that all the configuration should be calculated out of TaskManager.
So a full configuration should be generated before TaskManager started. Override the calculated configurations through -D now seems better. Best, Yang Xintong Song <tonysong...@gmail.com> 于2019年9月2日周一 上午11:39写道: > I just updated the FLIP wiki page [1], with the following changes: > > - Network memory uses JVM direct memory, and is accounted when setting > JVM max direct memory size parameter. > - Use dynamic configurations (`-Dkey=value`) to pass calculated memory > configs into TaskExecutors, instead of ENV variables. > - Remove 'supporting memory reservation' from the scope of this FLIP. > > @till @stephan, please take another look see if there are any other > concerns. > > Thank you~ > > Xintong Song > > > [1] > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > > On Mon, Sep 2, 2019 at 11:13 AM Xintong Song <tonysong...@gmail.com> > wrote: > > > Sorry for the late response. > > > > - Regarding the `TaskExecutorSpecifics` naming, let's discuss the detail > > in PR. > > - Regarding passing parameters into the `TaskExecutor`, +1 for using > > dynamic configuration at the moment, given that there are more questions > to > > be discussed to have a general framework for overwriting configurations > > with ENV variables. > > - Regarding memory reservation, I double checked with Yu and he will take > > care of it. > > > > Thank you~ > > > > Xintong Song > > > > > > > > On Thu, Aug 29, 2019 at 7:35 PM Till Rohrmann <trohrm...@apache.org> > > wrote: > > > >> What I forgot to add is that we could tackle specifying the > configuration > >> fully in an incremental way and that the full specification should be > the > >> desired end state. > >> > >> On Thu, Aug 29, 2019 at 1:33 PM Till Rohrmann <trohrm...@apache.org> > >> wrote: > >> > >> > I think our goal should be that the configuration is fully specified > >> when > >> > the process is started. By considering the internal calculation step > to > >> be > >> > rather validate existing values and calculate missing ones, these two > >> > proposal shouldn't even conflict (given determinism). > >> > > >> > Since we don't want to change an existing flink-conf.yaml, specifying > >> the > >> > full configuration would require to pass in the options differently. > >> > > >> > One way could be the ENV variables approach. The reason why I'm trying > >> to > >> > exclude this feature from the FLIP is that I believe it needs a bit > more > >> > discussion. Just some questions which come to my mind: What would be > the > >> > exact format (FLINK_KEY_NAME)? Would we support a dot separator which > is > >> > supported by some systems (FLINK.KEY.NAME)? If we accept the dot > >> > separator what would be the order of precedence if there are two ENV > >> > variables defined (FLINK_KEY_NAME and FLINK.KEY.NAME)? What is the > >> > precedence of env variable vs. dynamic configuration value specified > >> via -D? > >> > > >> > Another approach could be to pass in the dynamic configuration values > >> via > >> > `-Dkey=value` to the Flink process. For that we don't have to change > >> > anything because the functionality already exists. > >> > > >> > Cheers, > >> > Till > >> > > >> > On Thu, Aug 29, 2019 at 12:50 PM Stephan Ewen <se...@apache.org> > wrote: > >> > > >> >> I see. Under the assumption of strict determinism that should work. > >> >> > >> >> The original proposal had this point "don't compute inside the TM, > >> compute > >> >> outside and supply a full config", because that sounded more > intuitive. > >> >> > >> >> On Thu, Aug 29, 2019 at 12:15 PM Till Rohrmann <trohrm...@apache.org > > > >> >> wrote: > >> >> > >> >> > My understanding was that before starting the Flink process we > call a > >> >> > utility which calculates these values. I assume that this utility > >> will > >> >> do > >> >> > the calculation based on a set of configured values (process > memory, > >> >> flink > >> >> > memory, network memory etc.). Assuming that these values don't > differ > >> >> from > >> >> > the values with which the JVM is started, it should be possible to > >> >> > recompute them in the Flink process in order to set the values. > >> >> > > >> >> > > >> >> > > >> >> > On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen <se...@apache.org> > >> wrote: > >> >> > > >> >> > > When computing the values in the JVM process after it started, > how > >> >> would > >> >> > > you deal with values like Max Direct Memory, Metaspace size. > native > >> >> > memory > >> >> > > reservation (reduce heap size), etc? All the values that are > >> >> parameters > >> >> > to > >> >> > > the JVM process and that need to be supplied at process startup? > >> >> > > > >> >> > > On Wed, Aug 28, 2019 at 4:46 PM Till Rohrmann < > >> trohrm...@apache.org> > >> >> > > wrote: > >> >> > > > >> >> > > > Thanks for the clarification. I have some more comments: > >> >> > > > > >> >> > > > - I would actually split the logic to compute the process > memory > >> >> > > > requirements and storing the values into two things. E.g. one > >> could > >> >> > name > >> >> > > > the former TaskExecutorProcessUtility and the latter > >> >> > > > TaskExecutorProcessMemory. But we can discuss this on the PR > >> since > >> >> it's > >> >> > > > just a naming detail. > >> >> > > > > >> >> > > > - Generally, I'm not opposed to making configuration values > >> >> overridable > >> >> > > by > >> >> > > > ENV variables. I think this is a very good idea and makes the > >> >> > > > configurability of Flink processes easier. However, I think > that > >> >> adding > >> >> > > > this functionality should not be part of this FLIP because it > >> would > >> >> > > simply > >> >> > > > widen the scope unnecessarily. > >> >> > > > > >> >> > > > The reasons why I believe it is unnecessary are the following: > >> For > >> >> Yarn > >> >> > > we > >> >> > > > already create write a flink-conf.yaml which could be populated > >> with > >> >> > the > >> >> > > > memory settings. For the other processes it should not make a > >> >> > difference > >> >> > > > whether the loaded Configuration is populated with the memory > >> >> settings > >> >> > > from > >> >> > > > ENV variables or by using TaskExecutorProcessUtility to compute > >> the > >> >> > > missing > >> >> > > > values from the loaded configuration. If the latter would not > be > >> >> > possible > >> >> > > > (wrong or missing configuration values), then we should not > have > >> >> been > >> >> > > able > >> >> > > > to actually start the process in the first place. > >> >> > > > > >> >> > > > - Concerning the memory reservation: I agree with you that we > >> need > >> >> the > >> >> > > > memory reservation functionality to make streaming jobs work > with > >> >> > > "managed" > >> >> > > > memory. However, w/o this functionality the whole Flip would > >> already > >> >> > > bring > >> >> > > > a good amount of improvements to our users when running batch > >> jobs. > >> >> > > > Moreover, by keeping the scope smaller we can complete the FLIP > >> >> faster. > >> >> > > > Hence, I would propose to address the memory reservation > >> >> functionality > >> >> > > as a > >> >> > > > follow up FLIP (which Yu is working on if I'm not mistaken). > >> >> > > > > >> >> > > > Cheers, > >> >> > > > Till > >> >> > > > > >> >> > > > On Wed, Aug 28, 2019 at 11:43 AM Yang Wang < > >> danrtsey...@gmail.com> > >> >> > > wrote: > >> >> > > > > >> >> > > > > Just add my 2 cents. > >> >> > > > > > >> >> > > > > Using environment variables to override the configuration for > >> >> > different > >> >> > > > > taskmanagers is better. > >> >> > > > > We do not need to generate dedicated flink-conf.yaml for all > >> >> > > > taskmanagers. > >> >> > > > > A common flink-conf.yam and different environment variables > are > >> >> > enough. > >> >> > > > > By reducing the distributed cached files, it could make > >> launching > >> >> a > >> >> > > > > taskmanager faster. > >> >> > > > > > >> >> > > > > Stephan gives a good suggestion that we could move the logic > >> into > >> >> > > > > "GlobalConfiguration.loadConfig()" method. > >> >> > > > > Maybe the client could also benefit from this. Different > users > >> do > >> >> not > >> >> > > > have > >> >> > > > > to export FLINK_CONF_DIR to update few config options. > >> >> > > > > > >> >> > > > > > >> >> > > > > Best, > >> >> > > > > Yang > >> >> > > > > > >> >> > > > > Stephan Ewen <se...@apache.org> 于2019年8月28日周三 上午1:21写道: > >> >> > > > > > >> >> > > > > > One note on the Environment Variables and Configuration > >> >> discussion. > >> >> > > > > > > >> >> > > > > > My understanding is that passed ENV variables are added to > >> the > >> >> > > > > > configuration in the "GlobalConfiguration.loadConfig()" > >> method > >> >> (or > >> >> > > > > > similar). > >> >> > > > > > For all the code inside Flink, it looks like the data was > in > >> the > >> >> > > config > >> >> > > > > to > >> >> > > > > > start with, just that the scripts that compute the > variables > >> can > >> >> > pass > >> >> > > > the > >> >> > > > > > values to the process without actually needing to write a > >> file. > >> >> > > > > > > >> >> > > > > > For example the "GlobalConfiguration.loadConfig()" method > >> would > >> >> > take > >> >> > > > any > >> >> > > > > > ENV variable prefixed with "flink" and add it as a config > >> key. > >> >> > > > > > "flink_taskmanager_memory_size=2g" would become > >> >> > > > "taskmanager.memory.size: > >> >> > > > > > 2g". > >> >> > > > > > > >> >> > > > > > > >> >> > > > > > On Tue, Aug 27, 2019 at 4:05 PM Xintong Song < > >> >> > tonysong...@gmail.com> > >> >> > > > > > wrote: > >> >> > > > > > > >> >> > > > > > > Thanks for the comments, Till. > >> >> > > > > > > > >> >> > > > > > > I've also seen your comments on the wiki page, but let's > >> keep > >> >> the > >> >> > > > > > > discussion here. > >> >> > > > > > > > >> >> > > > > > > - Regarding 'TaskExecutorSpecifics', how do you think > about > >> >> > naming > >> >> > > it > >> >> > > > > > > 'TaskExecutorResourceSpecifics'. > >> >> > > > > > > - Regarding passing memory configurations into task > >> executors, > >> >> > I'm > >> >> > > in > >> >> > > > > > favor > >> >> > > > > > > of do it via environment variables rather than > >> configurations, > >> >> > with > >> >> > > > the > >> >> > > > > > > following two reasons. > >> >> > > > > > > - It is easier to keep the memory options once > calculate > >> >> not to > >> >> > > be > >> >> > > > > > > changed with environment variables rather than > >> configurations. > >> >> > > > > > > - I'm not sure whether we should write the > configuration > >> in > >> >> > > startup > >> >> > > > > > > scripts. Writing changes into the configuration files > when > >> >> > running > >> >> > > > the > >> >> > > > > > > startup scripts does not sounds right to me. Or we could > >> make > >> >> a > >> >> > > copy > >> >> > > > of > >> >> > > > > > > configuration files per flink cluster, and make the task > >> >> executor > >> >> > > to > >> >> > > > > load > >> >> > > > > > > from the copy, and clean up the copy after the cluster is > >> >> > shutdown, > >> >> > > > > which > >> >> > > > > > > is complicated. (I think this is also what Stephan means > in > >> >> his > >> >> > > > comment > >> >> > > > > > on > >> >> > > > > > > the wiki page?) > >> >> > > > > > > - Regarding reserving memory, I think this change should > be > >> >> > > included > >> >> > > > in > >> >> > > > > > > this FLIP. I think a big part of motivations of this FLIP > >> is > >> >> to > >> >> > > unify > >> >> > > > > > > memory configuration for streaming / batch and make it > easy > >> >> for > >> >> > > > > > configuring > >> >> > > > > > > rocksdb memory. If we don't support memory reservation, > >> then > >> >> > > > streaming > >> >> > > > > > jobs > >> >> > > > > > > cannot use managed memory (neither on-heap or off-heap), > >> which > >> >> > > makes > >> >> > > > > this > >> >> > > > > > > FLIP incomplete. > >> >> > > > > > > - Regarding network memory, I think you are right. I > think > >> we > >> >> > > > probably > >> >> > > > > > > don't need to change network stack from using direct > >> memory to > >> >> > > using > >> >> > > > > > unsafe > >> >> > > > > > > native memory. Network memory size is deterministic, > >> cannot be > >> >> > > > reserved > >> >> > > > > > as > >> >> > > > > > > managed memory does, and cannot be overused. I think it > >> also > >> >> > works > >> >> > > if > >> >> > > > > we > >> >> > > > > > > simply keep using direct memory for network and include > it > >> in > >> >> jvm > >> >> > > max > >> >> > > > > > > direct memory size. > >> >> > > > > > > > >> >> > > > > > > Thank you~ > >> >> > > > > > > > >> >> > > > > > > Xintong Song > >> >> > > > > > > > >> >> > > > > > > > >> >> > > > > > > > >> >> > > > > > > On Tue, Aug 27, 2019 at 8:12 PM Till Rohrmann < > >> >> > > trohrm...@apache.org> > >> >> > > > > > > wrote: > >> >> > > > > > > > >> >> > > > > > > > Hi Xintong, > >> >> > > > > > > > > >> >> > > > > > > > thanks for addressing the comments and adding a more > >> >> detailed > >> >> > > > > > > > implementation plan. I have a couple of comments > >> concerning > >> >> the > >> >> > > > > > > > implementation plan: > >> >> > > > > > > > > >> >> > > > > > > > - The name `TaskExecutorSpecifics` is not really > >> >> descriptive. > >> >> > > > > Choosing > >> >> > > > > > a > >> >> > > > > > > > different name could help here. > >> >> > > > > > > > - I'm not sure whether I would pass the memory > >> >> configuration to > >> >> > > the > >> >> > > > > > > > TaskExecutor via environment variables. I think it > would > >> be > >> >> > > better > >> >> > > > to > >> >> > > > > > > write > >> >> > > > > > > > it into the configuration one uses to start the TM > >> process. > >> >> > > > > > > > - If possible, I would exclude the memory reservation > >> from > >> >> this > >> >> > > > FLIP > >> >> > > > > > and > >> >> > > > > > > > add this as part of a dedicated FLIP. > >> >> > > > > > > > - If possible, then I would exclude changes to the > >> network > >> >> > stack > >> >> > > > from > >> >> > > > > > > this > >> >> > > > > > > > FLIP. Maybe we can simply say that the direct memory > >> needed > >> >> by > >> >> > > the > >> >> > > > > > > network > >> >> > > > > > > > stack is the framework direct memory requirement. > >> Changing > >> >> how > >> >> > > the > >> >> > > > > > memory > >> >> > > > > > > > is allocated can happen in a second step. This would > keep > >> >> the > >> >> > > scope > >> >> > > > > of > >> >> > > > > > > this > >> >> > > > > > > > FLIP smaller. > >> >> > > > > > > > > >> >> > > > > > > > Cheers, > >> >> > > > > > > > Till > >> >> > > > > > > > > >> >> > > > > > > > On Thu, Aug 22, 2019 at 2:51 PM Xintong Song < > >> >> > > > tonysong...@gmail.com> > >> >> > > > > > > > wrote: > >> >> > > > > > > > > >> >> > > > > > > > > Hi everyone, > >> >> > > > > > > > > > >> >> > > > > > > > > I just updated the FLIP document on wiki [1], with > the > >> >> > > following > >> >> > > > > > > changes. > >> >> > > > > > > > > > >> >> > > > > > > > > - Removed open question regarding MemorySegment > >> >> > allocation. > >> >> > > As > >> >> > > > > > > > > discussed, we exclude this topic from the scope of > >> this > >> >> > > FLIP. > >> >> > > > > > > > > - Updated content about JVM direct memory > parameter > >> >> > > according > >> >> > > > to > >> >> > > > > > > > recent > >> >> > > > > > > > > discussions, and moved the other options to > >> "Rejected > >> >> > > > > > Alternatives" > >> >> > > > > > > > for > >> >> > > > > > > > > the > >> >> > > > > > > > > moment. > >> >> > > > > > > > > - Added implementation steps. > >> >> > > > > > > > > > >> >> > > > > > > > > > >> >> > > > > > > > > Thank you~ > >> >> > > > > > > > > > >> >> > > > > > > > > Xintong Song > >> >> > > > > > > > > > >> >> > > > > > > > > > >> >> > > > > > > > > [1] > >> >> > > > > > > > > > >> >> > > > > > > > > > >> >> > > > > > > > > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > >> >> > > > > > > > > > >> >> > > > > > > > > On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen < > >> >> > se...@apache.org > >> >> > > > > >> >> > > > > > wrote: > >> >> > > > > > > > > > >> >> > > > > > > > > > @Xintong: Concerning "wait for memory users before > >> task > >> >> > > dispose > >> >> > > > > and > >> >> > > > > > > > > memory > >> >> > > > > > > > > > release": I agree, that's how it should be. Let's > >> try it > >> >> > out. > >> >> > > > > > > > > > > >> >> > > > > > > > > > @Xintong @Jingsong: Concerning " JVM does not wait > >> for > >> >> GC > >> >> > > when > >> >> > > > > > > > allocating > >> >> > > > > > > > > > direct memory buffer": There seems to be pretty > >> >> elaborate > >> >> > > logic > >> >> > > > > to > >> >> > > > > > > free > >> >> > > > > > > > > > buffers when allocating new ones. See > >> >> > > > > > > > > > > >> >> > > > > > > > > > > >> >> > > > > > > > > > >> >> > > > > > > > > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > >> > http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643 > >> >> > > > > > > > > > > >> >> > > > > > > > > > @Till: Maybe. If we assume that the JVM default > works > >> >> (like > >> >> > > > going > >> >> > > > > > > with > >> >> > > > > > > > > > option 2 and not setting "-XX:MaxDirectMemorySize" > at > >> >> all), > >> >> > > > then > >> >> > > > > I > >> >> > > > > > > > think > >> >> > > > > > > > > it > >> >> > > > > > > > > > should be okay to set "-XX:MaxDirectMemorySize" to > >> >> > > > > > > > > > "off_heap_managed_memory + direct_memory" even if > we > >> use > >> >> > > > RocksDB. > >> >> > > > > > > That > >> >> > > > > > > > > is a > >> >> > > > > > > > > > big if, though, I honestly have no idea :D Would be > >> >> good to > >> >> > > > > > > understand > >> >> > > > > > > > > > this, though, because this would affect option (2) > >> and > >> >> > option > >> >> > > > > > (1.2). > >> >> > > > > > > > > > > >> >> > > > > > > > > > On Mon, Aug 19, 2019 at 4:44 PM Xintong Song < > >> >> > > > > > tonysong...@gmail.com> > >> >> > > > > > > > > > wrote: > >> >> > > > > > > > > > > >> >> > > > > > > > > > > Thanks for the inputs, Jingsong. > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > Let me try to summarize your points. Please > correct > >> >> me if > >> >> > > I'm > >> >> > > > > > > wrong. > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > - Memory consumers should always avoid > returning > >> >> > memory > >> >> > > > > > segments > >> >> > > > > > > > to > >> >> > > > > > > > > > > memory manager while there are still > un-cleaned > >> >> > > > structures / > >> >> > > > > > > > threads > >> >> > > > > > > > > > > that > >> >> > > > > > > > > > > may use the memory. Otherwise, it would cause > >> >> serious > >> >> > > > > problems > >> >> > > > > > > by > >> >> > > > > > > > > > having > >> >> > > > > > > > > > > multiple consumers trying to use the same > memory > >> >> > > segment. > >> >> > > > > > > > > > > - JVM does not wait for GC when allocating > >> direct > >> >> > memory > >> >> > > > > > buffer. > >> >> > > > > > > > > > > Therefore even we set proper max direct memory > >> size > >> >> > > limit, > >> >> > > > > we > >> >> > > > > > > may > >> >> > > > > > > > > > still > >> >> > > > > > > > > > > encounter direct memory oom if the GC cleaning > >> >> memory > >> >> > > > slower > >> >> > > > > > > than > >> >> > > > > > > > > the > >> >> > > > > > > > > > > direct memory allocation. > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > Am I understanding this correctly? > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > Thank you~ > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > Xintong Song > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > On Mon, Aug 19, 2019 at 4:21 PM JingsongLee < > >> >> > > > > > > lzljs3620...@aliyun.com > >> >> > > > > > > > > > > .invalid> > >> >> > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > > Hi stephan: > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > About option 2: > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > if additional threads not cleanly shut down > >> before > >> >> we > >> >> > can > >> >> > > > > exit > >> >> > > > > > > the > >> >> > > > > > > > > > task: > >> >> > > > > > > > > > > > In the current case of memory reuse, it has > >> freed up > >> >> > the > >> >> > > > > memory > >> >> > > > > > > it > >> >> > > > > > > > > > > > uses. If this memory is used by other tasks > and > >> >> > > > asynchronous > >> >> > > > > > > > threads > >> >> > > > > > > > > > > > of exited task may still be writing, there > will > >> be > >> >> > > > > concurrent > >> >> > > > > > > > > security > >> >> > > > > > > > > > > > problems, and even lead to errors in user > >> computing > >> >> > > > results. > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > So I think this is a serious and intolerable > >> bug, No > >> >> > > matter > >> >> > > > > > what > >> >> > > > > > > > the > >> >> > > > > > > > > > > > option is, it should be avoided. > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > About direct memory cleaned by GC: > >> >> > > > > > > > > > > > I don't think it is a good idea, I've > >> encountered so > >> >> > many > >> >> > > > > > > > situations > >> >> > > > > > > > > > > > that it's too late for GC to cause > DirectMemory > >> >> OOM. > >> >> > > > Release > >> >> > > > > > and > >> >> > > > > > > > > > > > allocate DirectMemory depend on the type of > user > >> >> job, > >> >> > > > which > >> >> > > > > is > >> >> > > > > > > > > > > > often beyond our control. > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > Best, > >> >> > > > > > > > > > > > Jingsong Lee > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > > > > > > > >> >> > ------------------------------------------------------------------ > >> >> > > > > > > > > > > > From:Stephan Ewen <se...@apache.org> > >> >> > > > > > > > > > > > Send Time:2019年8月19日(星期一) 15:56 > >> >> > > > > > > > > > > > To:dev <dev@flink.apache.org> > >> >> > > > > > > > > > > > Subject:Re: [DISCUSS] FLIP-49: Unified Memory > >> >> > > Configuration > >> >> > > > > for > >> >> > > > > > > > > > > > TaskExecutors > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > My main concern with option 2 (manually release > >> >> memory) > >> >> > > is > >> >> > > > > that > >> >> > > > > > > > > > segfaults > >> >> > > > > > > > > > > > in the JVM send off all sorts of alarms on user > >> >> ends. > >> >> > So > >> >> > > we > >> >> > > > > > need > >> >> > > > > > > to > >> >> > > > > > > > > > > > guarantee that this never happens. > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > The trickyness is in tasks that uses data > >> >> structures / > >> >> > > > > > algorithms > >> >> > > > > > > > > with > >> >> > > > > > > > > > > > additional threads, like hash table spill/read > >> and > >> >> > > sorting > >> >> > > > > > > threads. > >> >> > > > > > > > > We > >> >> > > > > > > > > > > need > >> >> > > > > > > > > > > > to ensure that these cleanly shut down before > we > >> can > >> >> > exit > >> >> > > > the > >> >> > > > > > > task. > >> >> > > > > > > > > > > > I am not sure that we have that guaranteed > >> already, > >> >> > > that's > >> >> > > > > why > >> >> > > > > > > > option > >> >> > > > > > > > > > 1.1 > >> >> > > > > > > > > > > > seemed simpler to me. > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > On Mon, Aug 19, 2019 at 3:42 PM Xintong Song < > >> >> > > > > > > > tonysong...@gmail.com> > >> >> > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > Thanks for the comments, Stephan. Summarized > in > >> >> this > >> >> > > way > >> >> > > > > > really > >> >> > > > > > > > > makes > >> >> > > > > > > > > > > > > things easier to understand. > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > I'm in favor of option 2, at least for the > >> >> moment. I > >> >> > > > think > >> >> > > > > it > >> >> > > > > > > is > >> >> > > > > > > > > not > >> >> > > > > > > > > > > that > >> >> > > > > > > > > > > > > difficult to keep it segfault safe for memory > >> >> > manager, > >> >> > > as > >> >> > > > > > long > >> >> > > > > > > as > >> >> > > > > > > > > we > >> >> > > > > > > > > > > > always > >> >> > > > > > > > > > > > > de-allocate the memory segment when it is > >> released > >> >> > from > >> >> > > > the > >> >> > > > > > > > memory > >> >> > > > > > > > > > > > > consumers. Only if the memory consumer > continue > >> >> using > >> >> > > the > >> >> > > > > > > buffer > >> >> > > > > > > > of > >> >> > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > segment after releasing it, in which case we > do > >> >> want > >> >> > > the > >> >> > > > > job > >> >> > > > > > to > >> >> > > > > > > > > fail > >> >> > > > > > > > > > so > >> >> > > > > > > > > > > > we > >> >> > > > > > > > > > > > > detect the memory leak early. > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > For option 1.2, I don't think this is a good > >> idea. > >> >> > Not > >> >> > > > only > >> >> > > > > > > > because > >> >> > > > > > > > > > the > >> >> > > > > > > > > > > > > assumption (regular GC is enough to clean > >> direct > >> >> > > buffers) > >> >> > > > > may > >> >> > > > > > > not > >> >> > > > > > > > > > > always > >> >> > > > > > > > > > > > be > >> >> > > > > > > > > > > > > true, but also it makes harder for finding > >> >> problems > >> >> > in > >> >> > > > > cases > >> >> > > > > > of > >> >> > > > > > > > > > memory > >> >> > > > > > > > > > > > > overuse. E.g., user configured some direct > >> memory > >> >> for > >> >> > > the > >> >> > > > > > user > >> >> > > > > > > > > > > libraries. > >> >> > > > > > > > > > > > > If the library actually use more direct > memory > >> >> then > >> >> > > > > > configured, > >> >> > > > > > > > > which > >> >> > > > > > > > > > > > > cannot be cleaned by GC because they are > still > >> in > >> >> > use, > >> >> > > > may > >> >> > > > > > lead > >> >> > > > > > > > to > >> >> > > > > > > > > > > > overuse > >> >> > > > > > > > > > > > > of the total container memory. In that case, > >> if it > >> >> > > didn't > >> >> > > > > > touch > >> >> > > > > > > > the > >> >> > > > > > > > > > JVM > >> >> > > > > > > > > > > > > default max direct memory limit, we cannot > get > >> a > >> >> > direct > >> >> > > > > > memory > >> >> > > > > > > > OOM > >> >> > > > > > > > > > and > >> >> > > > > > > > > > > it > >> >> > > > > > > > > > > > > will become super hard to understand which > >> part of > >> >> > the > >> >> > > > > > > > > configuration > >> >> > > > > > > > > > > need > >> >> > > > > > > > > > > > > to be updated. > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > For option 1.1, it has the similar problem as > >> >> 1.2, if > >> >> > > the > >> >> > > > > > > > exceeded > >> >> > > > > > > > > > > direct > >> >> > > > > > > > > > > > > memory does not reach the max direct memory > >> limit > >> >> > > > specified > >> >> > > > > > by > >> >> > > > > > > > the > >> >> > > > > > > > > > > > > dedicated parameter. I think it is slightly > >> better > >> >> > than > >> >> > > > > 1.2, > >> >> > > > > > > only > >> >> > > > > > > > > > > because > >> >> > > > > > > > > > > > > we can tune the parameter. > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > Thank you~ > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > Xintong Song > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen > < > >> >> > > > > > se...@apache.org > >> >> > > > > > > > > >> >> > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > About the "-XX:MaxDirectMemorySize" > >> discussion, > >> >> > maybe > >> >> > > > let > >> >> > > > > > me > >> >> > > > > > > > > > > summarize > >> >> > > > > > > > > > > > > it a > >> >> > > > > > > > > > > > > > bit differently: > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > We have the following two options: > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > (1) We let MemorySegments be de-allocated > by > >> the > >> >> > GC. > >> >> > > > That > >> >> > > > > > > makes > >> >> > > > > > > > > it > >> >> > > > > > > > > > > > > segfault > >> >> > > > > > > > > > > > > > safe. But then we need a way to trigger GC > in > >> >> case > >> >> > > > > > > > de-allocation > >> >> > > > > > > > > > and > >> >> > > > > > > > > > > > > > re-allocation of a bunch of segments > happens > >> >> > quickly, > >> >> > > > > which > >> >> > > > > > > is > >> >> > > > > > > > > > often > >> >> > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > case during batch scheduling or task > restart. > >> >> > > > > > > > > > > > > > - The "-XX:MaxDirectMemorySize" (option > >> 1.1) > >> >> is > >> >> > one > >> >> > > > way > >> >> > > > > > to > >> >> > > > > > > do > >> >> > > > > > > > > > this > >> >> > > > > > > > > > > > > > - Another way could be to have a > dedicated > >> >> > > > bookkeeping > >> >> > > > > in > >> >> > > > > > > the > >> >> > > > > > > > > > > > > > MemoryManager (option 1.2), so that this > is a > >> >> > number > >> >> > > > > > > > independent > >> >> > > > > > > > > of > >> >> > > > > > > > > > > the > >> >> > > > > > > > > > > > > > "-XX:MaxDirectMemorySize" parameter. > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > (2) We manually allocate and de-allocate > the > >> >> memory > >> >> > > for > >> >> > > > > the > >> >> > > > > > > > > > > > > MemorySegments > >> >> > > > > > > > > > > > > > (option 2). That way we need not worry > about > >> >> > > triggering > >> >> > > > > GC > >> >> > > > > > by > >> >> > > > > > > > > some > >> >> > > > > > > > > > > > > > threshold or bookkeeping, but it is harder > to > >> >> > prevent > >> >> > > > > > > > segfaults. > >> >> > > > > > > > > We > >> >> > > > > > > > > > > > need > >> >> > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > be very careful about when we release the > >> memory > >> >> > > > segments > >> >> > > > > > > (only > >> >> > > > > > > > > in > >> >> > > > > > > > > > > the > >> >> > > > > > > > > > > > > > cleanup phase of the main thread). > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > If we go with option 1.1, we probably need > to > >> >> set > >> >> > > > > > > > > > > > > > "-XX:MaxDirectMemorySize" to > >> >> > > "off_heap_managed_memory + > >> >> > > > > > > > > > > direct_memory" > >> >> > > > > > > > > > > > > and > >> >> > > > > > > > > > > > > > have "direct_memory" as a separate reserved > >> >> memory > >> >> > > > pool. > >> >> > > > > > > > Because > >> >> > > > > > > > > if > >> >> > > > > > > > > > > we > >> >> > > > > > > > > > > > > just > >> >> > > > > > > > > > > > > > set "-XX:MaxDirectMemorySize" to > >> >> > > > > "off_heap_managed_memory + > >> >> > > > > > > > > > > > > jvm_overhead", > >> >> > > > > > > > > > > > > > then there will be times when that entire > >> >> memory is > >> >> > > > > > allocated > >> >> > > > > > > > by > >> >> > > > > > > > > > > direct > >> >> > > > > > > > > > > > > > buffers and we have nothing left for the > JVM > >> >> > > overhead. > >> >> > > > So > >> >> > > > > > we > >> >> > > > > > > > > either > >> >> > > > > > > > > > > > need > >> >> > > > > > > > > > > > > a > >> >> > > > > > > > > > > > > > way to compensate for that (again some > safety > >> >> > margin > >> >> > > > > cutoff > >> >> > > > > > > > > value) > >> >> > > > > > > > > > or > >> >> > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > will exceed container memory. > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > If we go with option 1.2, we need to be > aware > >> >> that > >> >> > it > >> >> > > > > takes > >> >> > > > > > > > > > elaborate > >> >> > > > > > > > > > > > > logic > >> >> > > > > > > > > > > > > > to push recycling of direct buffers without > >> >> always > >> >> > > > > > > triggering a > >> >> > > > > > > > > > full > >> >> > > > > > > > > > > > GC. > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > My first guess is that the options will be > >> >> easiest > >> >> > to > >> >> > > > do > >> >> > > > > in > >> >> > > > > > > the > >> >> > > > > > > > > > > > following > >> >> > > > > > > > > > > > > > order: > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > - Option 1.1 with a dedicated > direct_memory > >> >> > > > parameter, > >> >> > > > > as > >> >> > > > > > > > > > discussed > >> >> > > > > > > > > > > > > > above. We would need to find a way to set > the > >> >> > > > > direct_memory > >> >> > > > > > > > > > parameter > >> >> > > > > > > > > > > > by > >> >> > > > > > > > > > > > > > default. We could start with 64 MB and see > >> how > >> >> it > >> >> > > goes > >> >> > > > in > >> >> > > > > > > > > practice. > >> >> > > > > > > > > > > One > >> >> > > > > > > > > > > > > > danger I see is that setting this loo low > can > >> >> > cause a > >> >> > > > > bunch > >> >> > > > > > > of > >> >> > > > > > > > > > > > additional > >> >> > > > > > > > > > > > > > GCs compared to before (we need to watch > this > >> >> > > > carefully). > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > - Option 2. It is actually quite simple > to > >> >> > > implement, > >> >> > > > > we > >> >> > > > > > > > could > >> >> > > > > > > > > > try > >> >> > > > > > > > > > > > how > >> >> > > > > > > > > > > > > > segfault safe we are at the moment. > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > - Option 1.2: We would not touch the > >> >> > > > > > > > "-XX:MaxDirectMemorySize" > >> >> > > > > > > > > > > > > parameter > >> >> > > > > > > > > > > > > > at all and assume that all the direct > memory > >> >> > > > allocations > >> >> > > > > > that > >> >> > > > > > > > the > >> >> > > > > > > > > > JVM > >> >> > > > > > > > > > > > and > >> >> > > > > > > > > > > > > > Netty do are infrequent enough to be > cleaned > >> up > >> >> > fast > >> >> > > > > enough > >> >> > > > > > > > > through > >> >> > > > > > > > > > > > > regular > >> >> > > > > > > > > > > > > > GC. I am not sure if that is a valid > >> assumption, > >> >> > > > though. > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > Best, > >> >> > > > > > > > > > > > > > Stephan > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 2:16 PM Xintong > Song > >> < > >> >> > > > > > > > > > tonysong...@gmail.com> > >> >> > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Thanks for sharing your opinion Till. > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > I'm also in favor of alternative 2. I was > >> >> > wondering > >> >> > > > > > whether > >> >> > > > > > > > we > >> >> > > > > > > > > > can > >> >> > > > > > > > > > > > > avoid > >> >> > > > > > > > > > > > > > > using Unsafe.allocate() for off-heap > >> managed > >> >> > memory > >> >> > > > and > >> >> > > > > > > > network > >> >> > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > with > >> >> > > > > > > > > > > > > > > alternative 3. But after giving it a > second > >> >> > > thought, > >> >> > > > I > >> >> > > > > > > think > >> >> > > > > > > > > even > >> >> > > > > > > > > > > for > >> >> > > > > > > > > > > > > > > alternative 3 using direct memory for > >> off-heap > >> >> > > > managed > >> >> > > > > > > memory > >> >> > > > > > > > > > could > >> >> > > > > > > > > > > > > cause > >> >> > > > > > > > > > > > > > > problems. > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Hi Yang, > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Regarding your concern, I think what > >> proposed > >> >> in > >> >> > > this > >> >> > > > > > FLIP > >> >> > > > > > > it > >> >> > > > > > > > > to > >> >> > > > > > > > > > > have > >> >> > > > > > > > > > > > > > both > >> >> > > > > > > > > > > > > > > off-heap managed memory and network > memory > >> >> > > allocated > >> >> > > > > > > through > >> >> > > > > > > > > > > > > > > Unsafe.allocate(), which means they are > >> >> > practically > >> >> > > > > > native > >> >> > > > > > > > > memory > >> >> > > > > > > > > > > and > >> >> > > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > limited by JVM max direct memory. The > only > >> >> parts > >> >> > of > >> >> > > > > > memory > >> >> > > > > > > > > > limited > >> >> > > > > > > > > > > by > >> >> > > > > > > > > > > > > JVM > >> >> > > > > > > > > > > > > > > max direct memory are task off-heap > memory > >> and > >> >> > JVM > >> >> > > > > > > overhead, > >> >> > > > > > > > > > which > >> >> > > > > > > > > > > > are > >> >> > > > > > > > > > > > > > > exactly alternative 2 suggests to set the > >> JVM > >> >> max > >> >> > > > > direct > >> >> > > > > > > > memory > >> >> > > > > > > > > > to. > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Thank you~ > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Xintong Song > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 1:48 PM Till > >> Rohrmann > >> >> < > >> >> > > > > > > > > > > trohrm...@apache.org> > >> >> > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > Thanks for the clarification Xintong. I > >> >> > > understand > >> >> > > > > the > >> >> > > > > > > two > >> >> > > > > > > > > > > > > alternatives > >> >> > > > > > > > > > > > > > > > now. > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > I would be in favour of option 2 > because > >> it > >> >> > makes > >> >> > > > > > things > >> >> > > > > > > > > > > explicit. > >> >> > > > > > > > > > > > If > >> >> > > > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > > > don't limit the direct memory, I fear > >> that > >> >> we > >> >> > > might > >> >> > > > > end > >> >> > > > > > > up > >> >> > > > > > > > > in a > >> >> > > > > > > > > > > > > similar > >> >> > > > > > > > > > > > > > > > situation as we are currently in: The > >> user > >> >> > might > >> >> > > > see > >> >> > > > > > that > >> >> > > > > > > > her > >> >> > > > > > > > > > > > process > >> >> > > > > > > > > > > > > > > gets > >> >> > > > > > > > > > > > > > > > killed by the OS and does not know why > >> this > >> >> is > >> >> > > the > >> >> > > > > > case. > >> >> > > > > > > > > > > > > Consequently, > >> >> > > > > > > > > > > > > > > she > >> >> > > > > > > > > > > > > > > > tries to decrease the process memory > size > >> >> > > (similar > >> >> > > > to > >> >> > > > > > > > > > increasing > >> >> > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > cutoff > >> >> > > > > > > > > > > > > > > > ratio) in order to accommodate for the > >> extra > >> >> > > direct > >> >> > > > > > > memory. > >> >> > > > > > > > > > Even > >> >> > > > > > > > > > > > > worse, > >> >> > > > > > > > > > > > > > > she > >> >> > > > > > > > > > > > > > > > tries to decrease memory budgets which > >> are > >> >> not > >> >> > > > fully > >> >> > > > > > used > >> >> > > > > > > > and > >> >> > > > > > > > > > > hence > >> >> > > > > > > > > > > > > > won't > >> >> > > > > > > > > > > > > > > > change the overall memory consumption. > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > Cheers, > >> >> > > > > > > > > > > > > > > > Till > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 11:01 AM > Xintong > >> >> Song < > >> >> > > > > > > > > > > > tonysong...@gmail.com > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Let me explain this with a concrete > >> >> example > >> >> > > Till. > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Let's say we have the following > >> scenario. > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Total Process Memory: 1GB > >> >> > > > > > > > > > > > > > > > > JVM Direct Memory (Task Off-Heap > >> Memory + > >> >> JVM > >> >> > > > > > > Overhead): > >> >> > > > > > > > > > 200MB > >> >> > > > > > > > > > > > > > > > > Other Memory (JVM Heap Memory, JVM > >> >> Metaspace, > >> >> > > > > > Off-Heap > >> >> > > > > > > > > > Managed > >> >> > > > > > > > > > > > > Memory > >> >> > > > > > > > > > > > > > > and > >> >> > > > > > > > > > > > > > > > > Network Memory): 800MB > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > For alternative 2, we set > >> >> > > -XX:MaxDirectMemorySize > >> >> > > > > to > >> >> > > > > > > > 200MB. > >> >> > > > > > > > > > > > > > > > > For alternative 3, we set > >> >> > > -XX:MaxDirectMemorySize > >> >> > > > > to > >> >> > > > > > a > >> >> > > > > > > > very > >> >> > > > > > > > > > > large > >> >> > > > > > > > > > > > > > > value, > >> >> > > > > > > > > > > > > > > > > let's say 1TB. > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > If the actual direct memory usage of > >> Task > >> >> > > > Off-Heap > >> >> > > > > > > Memory > >> >> > > > > > > > > and > >> >> > > > > > > > > > > JVM > >> >> > > > > > > > > > > > > > > > Overhead > >> >> > > > > > > > > > > > > > > > > do not exceed 200MB, then > alternative 2 > >> >> and > >> >> > > > > > > alternative 3 > >> >> > > > > > > > > > > should > >> >> > > > > > > > > > > > > have > >> >> > > > > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > > same utility. Setting larger > >> >> > > > > -XX:MaxDirectMemorySize > >> >> > > > > > > will > >> >> > > > > > > > > not > >> >> > > > > > > > > > > > > reduce > >> >> > > > > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > > sizes of the other memory pools. > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > If the actual direct memory usage of > >> Task > >> >> > > > Off-Heap > >> >> > > > > > > Memory > >> >> > > > > > > > > and > >> >> > > > > > > > > > > JVM > >> >> > > > > > > > > > > > > > > > > Overhead potentially exceed 200MB, > then > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > - Alternative 2 suffers from > >> frequent > >> >> OOM. > >> >> > > To > >> >> > > > > > avoid > >> >> > > > > > > > > that, > >> >> > > > > > > > > > > the > >> >> > > > > > > > > > > > > only > >> >> > > > > > > > > > > > > > > > thing > >> >> > > > > > > > > > > > > > > > > user can do is to modify the > >> >> configuration > >> >> > > and > >> >> > > > > > > > increase > >> >> > > > > > > > > > JVM > >> >> > > > > > > > > > > > > Direct > >> >> > > > > > > > > > > > > > > > > Memory > >> >> > > > > > > > > > > > > > > > > (Task Off-Heap Memory + JVM > >> Overhead). > >> >> > Let's > >> >> > > > say > >> >> > > > > > > that > >> >> > > > > > > > > user > >> >> > > > > > > > > > > > > > increases > >> >> > > > > > > > > > > > > > > > JVM > >> >> > > > > > > > > > > > > > > > > Direct Memory to 250MB, this will > >> >> reduce > >> >> > the > >> >> > > > > total > >> >> > > > > > > > size > >> >> > > > > > > > > of > >> >> > > > > > > > > > > > other > >> >> > > > > > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > pools to 750MB, given the total > >> process > >> >> > > memory > >> >> > > > > > > remains > >> >> > > > > > > > > > 1GB. > >> >> > > > > > > > > > > > > > > > > - For alternative 3, there is no > >> >> chance of > >> >> > > > > direct > >> >> > > > > > > OOM. > >> >> > > > > > > > > > There > >> >> > > > > > > > > > > > are > >> >> > > > > > > > > > > > > > > > chances > >> >> > > > > > > > > > > > > > > > > of exceeding the total process > >> memory > >> >> > limit, > >> >> > > > but > >> >> > > > > > > given > >> >> > > > > > > > > > that > >> >> > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > process > >> >> > > > > > > > > > > > > > > > > may > >> >> > > > > > > > > > > > > > > > > not use up all the reserved native > >> >> memory > >> >> > > > > > (Off-Heap > >> >> > > > > > > > > > Managed > >> >> > > > > > > > > > > > > > Memory, > >> >> > > > > > > > > > > > > > > > > Network > >> >> > > > > > > > > > > > > > > > > Memory, JVM Metaspace), if the > >> actual > >> >> > direct > >> >> > > > > > memory > >> >> > > > > > > > > usage > >> >> > > > > > > > > > is > >> >> > > > > > > > > > > > > > > slightly > >> >> > > > > > > > > > > > > > > > > above > >> >> > > > > > > > > > > > > > > > > yet very close to 200MB, user > >> probably > >> >> do > >> >> > > not > >> >> > > > > need > >> >> > > > > > > to > >> >> > > > > > > > > > change > >> >> > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > > configurations. > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Therefore, I think from the user's > >> >> > > perspective, a > >> >> > > > > > > > feasible > >> >> > > > > > > > > > > > > > > configuration > >> >> > > > > > > > > > > > > > > > > for alternative 2 may lead to lower > >> >> resource > >> >> > > > > > > utilization > >> >> > > > > > > > > > > compared > >> >> > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > alternative 3. > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Thank you~ > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Xintong Song > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 10:28 AM Till > >> >> > Rohrmann > >> >> > > < > >> >> > > > > > > > > > > > > trohrm...@apache.org > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > I guess you have to help me > >> understand > >> >> the > >> >> > > > > > difference > >> >> > > > > > > > > > between > >> >> > > > > > > > > > > > > > > > > alternative 2 > >> >> > > > > > > > > > > > > > > > > > and 3 wrt to memory under > utilization > >> >> > > Xintong. > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > - Alternative 2: set > >> >> XX:MaxDirectMemorySize > >> >> > > to > >> >> > > > > Task > >> >> > > > > > > > > > Off-Heap > >> >> > > > > > > > > > > > > Memory > >> >> > > > > > > > > > > > > > > and > >> >> > > > > > > > > > > > > > > > > JVM > >> >> > > > > > > > > > > > > > > > > > Overhead. Then there is the risk > that > >> >> this > >> >> > > size > >> >> > > > > is > >> >> > > > > > > too > >> >> > > > > > > > > low > >> >> > > > > > > > > > > > > > resulting > >> >> > > > > > > > > > > > > > > > in a > >> >> > > > > > > > > > > > > > > > > > lot of garbage collection and > >> >> potentially > >> >> > an > >> >> > > > OOM. > >> >> > > > > > > > > > > > > > > > > > - Alternative 3: set > >> >> XX:MaxDirectMemorySize > >> >> > > to > >> >> > > > > > > > something > >> >> > > > > > > > > > > larger > >> >> > > > > > > > > > > > > > than > >> >> > > > > > > > > > > > > > > > > > alternative 2. This would of course > >> >> reduce > >> >> > > the > >> >> > > > > > sizes > >> >> > > > > > > of > >> >> > > > > > > > > the > >> >> > > > > > > > > > > > other > >> >> > > > > > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > > types. > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > How would alternative 2 now result > >> in an > >> >> > > under > >> >> > > > > > > > > utilization > >> >> > > > > > > > > > of > >> >> > > > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > > compared to alternative 3? If > >> >> alternative 3 > >> >> > > > > > strictly > >> >> > > > > > > > > sets a > >> >> > > > > > > > > > > > > higher > >> >> > > > > > > > > > > > > > > max > >> >> > > > > > > > > > > > > > > > > > direct memory size and we use only > >> >> little, > >> >> > > > then I > >> >> > > > > > > would > >> >> > > > > > > > > > > expect > >> >> > > > > > > > > > > > > that > >> >> > > > > > > > > > > > > > > > > > alternative 3 results in memory > under > >> >> > > > > utilization. > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > Cheers, > >> >> > > > > > > > > > > > > > > > > > Till > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 4:19 PM > Yang > >> >> Wang < > >> >> > > > > > > > > > > > danrtsey...@gmail.com > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > Hi xintong,till > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Native and Direct Memory > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > My point is setting a very large > >> max > >> >> > direct > >> >> > > > > > memory > >> >> > > > > > > > size > >> >> > > > > > > > > > > when > >> >> > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > do > >> >> > > > > > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > > > > > differentiate direct and native > >> >> memory. > >> >> > If > >> >> > > > the > >> >> > > > > > > direct > >> >> > > > > > > > > > > > > > > > memory,including > >> >> > > > > > > > > > > > > > > > > > user > >> >> > > > > > > > > > > > > > > > > > > direct memory and framework > direct > >> >> > > > memory,could > >> >> > > > > > be > >> >> > > > > > > > > > > calculated > >> >> > > > > > > > > > > > > > > > > > > correctly,then > >> >> > > > > > > > > > > > > > > > > > > i am in favor of setting direct > >> memory > >> >> > with > >> >> > > > > fixed > >> >> > > > > > > > > value. > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Memory Calculation > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > I agree with xintong. For Yarn > and > >> >> k8s,we > >> >> > > > need > >> >> > > > > to > >> >> > > > > > > > check > >> >> > > > > > > > > > the > >> >> > > > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > > > configurations in client to avoid > >> >> > > submitting > >> >> > > > > > > > > successfully > >> >> > > > > > > > > > > and > >> >> > > > > > > > > > > > > > > failing > >> >> > > > > > > > > > > > > > > > > in > >> >> > > > > > > > > > > > > > > > > > > the flink master. > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > Best, > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > Yang > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > Xintong Song < > >> tonysong...@gmail.com > >> >> > > > > >于2019年8月13日 > >> >> > > > > > > > > > 周二22:07写道: > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Thanks for replying, Till. > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > About MemorySegment, I think > you > >> are > >> >> > > right > >> >> > > > > that > >> >> > > > > > > we > >> >> > > > > > > > > > should > >> >> > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > > include > >> >> > > > > > > > > > > > > > > > > > > this > >> >> > > > > > > > > > > > > > > > > > > > issue in the scope of this > FLIP. > >> >> This > >> >> > > FLIP > >> >> > > > > > should > >> >> > > > > > > > > > > > concentrate > >> >> > > > > > > > > > > > > > on > >> >> > > > > > > > > > > > > > > > how > >> >> > > > > > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > > > > configure memory pools for > >> >> > TaskExecutors, > >> >> > > > > with > >> >> > > > > > > > > minimum > >> >> > > > > > > > > > > > > > > involvement > >> >> > > > > > > > > > > > > > > > on > >> >> > > > > > > > > > > > > > > > > > how > >> >> > > > > > > > > > > > > > > > > > > > memory consumers use it. > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > About direct memory, I think > >> >> > alternative > >> >> > > 3 > >> >> > > > > may > >> >> > > > > > > not > >> >> > > > > > > > > > having > >> >> > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > same > >> >> > > > > > > > > > > > > > > > > over > >> >> > > > > > > > > > > > > > > > > > > > reservation issue that > >> alternative 2 > >> >> > > does, > >> >> > > > > but > >> >> > > > > > at > >> >> > > > > > > > the > >> >> > > > > > > > > > > cost > >> >> > > > > > > > > > > > of > >> >> > > > > > > > > > > > > > > risk > >> >> > > > > > > > > > > > > > > > of > >> >> > > > > > > > > > > > > > > > > > > over > >> >> > > > > > > > > > > > > > > > > > > > using memory at the container > >> level, > >> >> > > which > >> >> > > > is > >> >> > > > > > not > >> >> > > > > > > > > good. > >> >> > > > > > > > > > > My > >> >> > > > > > > > > > > > > > point > >> >> > > > > > > > > > > > > > > is > >> >> > > > > > > > > > > > > > > > > > that > >> >> > > > > > > > > > > > > > > > > > > > both "Task Off-Heap Memory" and > >> "JVM > >> >> > > > > Overhead" > >> >> > > > > > > are > >> >> > > > > > > > > not > >> >> > > > > > > > > > > easy > >> >> > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > config. > >> >> > > > > > > > > > > > > > > > > > > For > >> >> > > > > > > > > > > > > > > > > > > > alternative 2, users might > >> configure > >> >> > them > >> >> > > > > > higher > >> >> > > > > > > > than > >> >> > > > > > > > > > > what > >> >> > > > > > > > > > > > > > > actually > >> >> > > > > > > > > > > > > > > > > > > needed, > >> >> > > > > > > > > > > > > > > > > > > > just to avoid getting a direct > >> OOM. > >> >> For > >> >> > > > > > > alternative > >> >> > > > > > > > > 3, > >> >> > > > > > > > > > > > users > >> >> > > > > > > > > > > > > do > >> >> > > > > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > > > get > >> >> > > > > > > > > > > > > > > > > > > > direct OOM, so they may not > >> config > >> >> the > >> >> > > two > >> >> > > > > > > options > >> >> > > > > > > > > > > > > aggressively > >> >> > > > > > > > > > > > > > > > high. > >> >> > > > > > > > > > > > > > > > > > But > >> >> > > > > > > > > > > > > > > > > > > > the consequences are risks of > >> >> overall > >> >> > > > > container > >> >> > > > > > > > > memory > >> >> > > > > > > > > > > > usage > >> >> > > > > > > > > > > > > > > > exceeds > >> >> > > > > > > > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > > > > > budget. > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Thank you~ > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Xintong Song > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 9:39 AM > >> Till > >> >> > > > > Rohrmann < > >> >> > > > > > > > > > > > > > > > trohrm...@apache.org> > >> >> > > > > > > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > Thanks for proposing this > FLIP > >> >> > Xintong. > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > All in all I think it already > >> >> looks > >> >> > > quite > >> >> > > > > > good. > >> >> > > > > > > > > > > > Concerning > >> >> > > > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > > first > >> >> > > > > > > > > > > > > > > > > > > open > >> >> > > > > > > > > > > > > > > > > > > > > question about allocating > >> memory > >> >> > > > segments, > >> >> > > > > I > >> >> > > > > > > was > >> >> > > > > > > > > > > > wondering > >> >> > > > > > > > > > > > > > > > whether > >> >> > > > > > > > > > > > > > > > > > this > >> >> > > > > > > > > > > > > > > > > > > > is > >> >> > > > > > > > > > > > > > > > > > > > > strictly necessary to do in > the > >> >> > context > >> >> > > > of > >> >> > > > > > this > >> >> > > > > > > > > FLIP > >> >> > > > > > > > > > or > >> >> > > > > > > > > > > > > > whether > >> >> > > > > > > > > > > > > > > > > this > >> >> > > > > > > > > > > > > > > > > > > > could > >> >> > > > > > > > > > > > > > > > > > > > > be done as a follow up? > Without > >> >> > knowing > >> >> > > > all > >> >> > > > > > > > > details, > >> >> > > > > > > > > > I > >> >> > > > > > > > > > > > > would > >> >> > > > > > > > > > > > > > be > >> >> > > > > > > > > > > > > > > > > > > concerned > >> >> > > > > > > > > > > > > > > > > > > > > that we would widen the scope > >> of > >> >> this > >> >> > > > FLIP > >> >> > > > > > too > >> >> > > > > > > > much > >> >> > > > > > > > > > > > because > >> >> > > > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > > > > would > >> >> > > > > > > > > > > > > > > > > > > have > >> >> > > > > > > > > > > > > > > > > > > > > to touch all the existing > call > >> >> sites > >> >> > of > >> >> > > > the > >> >> > > > > > > > > > > MemoryManager > >> >> > > > > > > > > > > > > > where > >> >> > > > > > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > > > > > > > allocate > >> >> > > > > > > > > > > > > > > > > > > > > memory segments (this should > >> >> mainly > >> >> > be > >> >> > > > > batch > >> >> > > > > > > > > > > operators). > >> >> > > > > > > > > > > > > The > >> >> > > > > > > > > > > > > > > > > addition > >> >> > > > > > > > > > > > > > > > > > > of > >> >> > > > > > > > > > > > > > > > > > > > > the memory reservation call > to > >> the > >> >> > > > > > > MemoryManager > >> >> > > > > > > > > > should > >> >> > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > be > >> >> > > > > > > > > > > > > > > > > > affected > >> >> > > > > > > > > > > > > > > > > > > > by > >> >> > > > > > > > > > > > > > > > > > > > > this and I would hope that > >> this is > >> >> > the > >> >> > > > only > >> >> > > > > > > point > >> >> > > > > > > > > of > >> >> > > > > > > > > > > > > > > interaction > >> >> > > > > > > > > > > > > > > > a > >> >> > > > > > > > > > > > > > > > > > > > > streaming job would have with > >> the > >> >> > > > > > > MemoryManager. > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > Concerning the second open > >> >> question > >> >> > > about > >> >> > > > > > > setting > >> >> > > > > > > > > or > >> >> > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > setting > >> >> > > > > > > > > > > > > > > > a > >> >> > > > > > > > > > > > > > > > > > max > >> >> > > > > > > > > > > > > > > > > > > > > direct memory limit, I would > >> also > >> >> be > >> >> > > > > > interested > >> >> > > > > > > > why > >> >> > > > > > > > > > > Yang > >> >> > > > > > > > > > > > > Wang > >> >> > > > > > > > > > > > > > > > > thinks > >> >> > > > > > > > > > > > > > > > > > > > > leaving it open would be > best. > >> My > >> >> > > concern > >> >> > > > > > about > >> >> > > > > > > > > this > >> >> > > > > > > > > > > > would > >> >> > > > > > > > > > > > > be > >> >> > > > > > > > > > > > > > > > that > >> >> > > > > > > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > > > > > > > would > >> >> > > > > > > > > > > > > > > > > > > > > be in a similar situation as > we > >> >> are > >> >> > now > >> >> > > > > with > >> >> > > > > > > the > >> >> > > > > > > > > > > > > > > > > RocksDBStateBackend. > >> >> > > > > > > > > > > > > > > > > > > If > >> >> > > > > > > > > > > > > > > > > > > > > the different memory pools > are > >> not > >> >> > > > clearly > >> >> > > > > > > > > separated > >> >> > > > > > > > > > > and > >> >> > > > > > > > > > > > > can > >> >> > > > > > > > > > > > > > > > spill > >> >> > > > > > > > > > > > > > > > > > over > >> >> > > > > > > > > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > > > > > a different pool, then it is > >> quite > >> >> > hard > >> >> > > > to > >> >> > > > > > > > > understand > >> >> > > > > > > > > > > > what > >> >> > > > > > > > > > > > > > > > exactly > >> >> > > > > > > > > > > > > > > > > > > > causes a > >> >> > > > > > > > > > > > > > > > > > > > > process to get killed for > using > >> >> too > >> >> > > much > >> >> > > > > > > memory. > >> >> > > > > > > > > This > >> >> > > > > > > > > > > > could > >> >> > > > > > > > > > > > > > > then > >> >> > > > > > > > > > > > > > > > > > easily > >> >> > > > > > > > > > > > > > > > > > > > > lead to a similar situation > >> what > >> >> we > >> >> > > have > >> >> > > > > with > >> >> > > > > > > the > >> >> > > > > > > > > > > > > > cutoff-ratio. > >> >> > > > > > > > > > > > > > > > So > >> >> > > > > > > > > > > > > > > > > > why > >> >> > > > > > > > > > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > > > > > > > setting a sane default value > >> for > >> >> max > >> >> > > > direct > >> >> > > > > > > > memory > >> >> > > > > > > > > > and > >> >> > > > > > > > > > > > > giving > >> >> > > > > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > > > user > >> >> > > > > > > > > > > > > > > > > > > an > >> >> > > > > > > > > > > > > > > > > > > > > option to increase it if he > >> runs > >> >> into > >> >> > > an > >> >> > > > > OOM. > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > @Xintong, how would > >> alternative 2 > >> >> > lead > >> >> > > to > >> >> > > > > > lower > >> >> > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > utilization > >> >> > > > > > > > > > > > > > > > > > than > >> >> > > > > > > > > > > > > > > > > > > > > alternative 3 where we set > the > >> >> direct > >> >> > > > > memory > >> >> > > > > > > to a > >> >> > > > > > > > > > > higher > >> >> > > > > > > > > > > > > > value? > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > Cheers, > >> >> > > > > > > > > > > > > > > > > > > > > Till > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > On Fri, Aug 9, 2019 at 9:12 > AM > >> >> > Xintong > >> >> > > > > Song < > >> >> > > > > > > > > > > > > > > > tonysong...@gmail.com > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > Thanks for the feedback, > >> Yang. > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > Regarding your comments: > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > *Native and Direct Memory* > >> >> > > > > > > > > > > > > > > > > > > > > > I think setting a very > large > >> max > >> >> > > direct > >> >> > > > > > > memory > >> >> > > > > > > > > size > >> >> > > > > > > > > > > > > > > definitely > >> >> > > > > > > > > > > > > > > > > has > >> >> > > > > > > > > > > > > > > > > > > some > >> >> > > > > > > > > > > > > > > > > > > > > > good sides. E.g., we do not > >> >> worry > >> >> > > about > >> >> > > > > > > direct > >> >> > > > > > > > > OOM, > >> >> > > > > > > > > > > and > >> >> > > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > > > don't > >> >> > > > > > > > > > > > > > > > > > even > >> >> > > > > > > > > > > > > > > > > > > > > need > >> >> > > > > > > > > > > > > > > > > > > > > > to allocate managed / > network > >> >> > memory > >> >> > > > with > >> >> > > > > > > > > > > > > > Unsafe.allocate() . > >> >> > > > > > > > > > > > > > > > > > > > > > However, there are also > some > >> >> down > >> >> > > sides > >> >> > > > > of > >> >> > > > > > > > doing > >> >> > > > > > > > > > > this. > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > - One thing I can think > >> of is > >> >> > that > >> >> > > > if > >> >> > > > > a > >> >> > > > > > > task > >> >> > > > > > > > > > > > executor > >> >> > > > > > > > > > > > > > > > > container > >> >> > > > > > > > > > > > > > > > > > is > >> >> > > > > > > > > > > > > > > > > > > > > > killed due to overusing > >> >> memory, > >> >> > it > >> >> > > > > could > >> >> > > > > > > be > >> >> > > > > > > > > hard > >> >> > > > > > > > > > > for > >> >> > > > > > > > > > > > > use > >> >> > > > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > know > >> >> > > > > > > > > > > > > > > > > > > > which > >> >> > > > > > > > > > > > > > > > > > > > > > part > >> >> > > > > > > > > > > > > > > > > > > > > > of the memory is > overused. > >> >> > > > > > > > > > > > > > > > > > > > > > - Another down side is > >> that > >> >> the > >> >> > > JVM > >> >> > > > > > never > >> >> > > > > > > > > > trigger > >> >> > > > > > > > > > > GC > >> >> > > > > > > > > > > > > due > >> >> > > > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > > > reaching > >> >> > > > > > > > > > > > > > > > > > > > > max > >> >> > > > > > > > > > > > > > > > > > > > > > direct memory limit, > >> because > >> >> the > >> >> > > > limit > >> >> > > > > > is > >> >> > > > > > > > too > >> >> > > > > > > > > > high > >> >> > > > > > > > > > > > to > >> >> > > > > > > > > > > > > be > >> >> > > > > > > > > > > > > > > > > > reached. > >> >> > > > > > > > > > > > > > > > > > > > That > >> >> > > > > > > > > > > > > > > > > > > > > > means we kind of relay > on > >> >> heap > >> >> > > > memory > >> >> > > > > to > >> >> > > > > > > > > trigger > >> >> > > > > > > > > > > GC > >> >> > > > > > > > > > > > > and > >> >> > > > > > > > > > > > > > > > > release > >> >> > > > > > > > > > > > > > > > > > > > direct > >> >> > > > > > > > > > > > > > > > > > > > > > memory. That could be a > >> >> problem > >> >> > in > >> >> > > > > cases > >> >> > > > > > > > where > >> >> > > > > > > > > > we > >> >> > > > > > > > > > > > have > >> >> > > > > > > > > > > > > > > more > >> >> > > > > > > > > > > > > > > > > > direct > >> >> > > > > > > > > > > > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > > > > > > usage but not enough > heap > >> >> > activity > >> >> > > > to > >> >> > > > > > > > trigger > >> >> > > > > > > > > > the > >> >> > > > > > > > > > > > GC. > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > Maybe you can share your > >> reasons > >> >> > for > >> >> > > > > > > preferring > >> >> > > > > > > > > > > > setting a > >> >> > > > > > > > > > > > > > > very > >> >> > > > > > > > > > > > > > > > > > large > >> >> > > > > > > > > > > > > > > > > > > > > value, > >> >> > > > > > > > > > > > > > > > > > > > > > if there are anything else > I > >> >> > > > overlooked. > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > *Memory Calculation* > >> >> > > > > > > > > > > > > > > > > > > > > > If there is any conflict > >> between > >> >> > > > multiple > >> >> > > > > > > > > > > configuration > >> >> > > > > > > > > > > > > > that > >> >> > > > > > > > > > > > > > > > user > >> >> > > > > > > > > > > > > > > > > > > > > > explicitly specified, I > >> think we > >> >> > > should > >> >> > > > > > throw > >> >> > > > > > > > an > >> >> > > > > > > > > > > error. > >> >> > > > > > > > > > > > > > > > > > > > > > I think doing checking on > the > >> >> > client > >> >> > > > side > >> >> > > > > > is > >> >> > > > > > > a > >> >> > > > > > > > > good > >> >> > > > > > > > > > > > idea, > >> >> > > > > > > > > > > > > > so > >> >> > > > > > > > > > > > > > > > that > >> >> > > > > > > > > > > > > > > > > > on > >> >> > > > > > > > > > > > > > > > > > > > > Yarn / > >> >> > > > > > > > > > > > > > > > > > > > > > K8s we can discover the > >> problem > >> >> > > before > >> >> > > > > > > > submitting > >> >> > > > > > > > > > the > >> >> > > > > > > > > > > > > Flink > >> >> > > > > > > > > > > > > > > > > > cluster, > >> >> > > > > > > > > > > > > > > > > > > > > which > >> >> > > > > > > > > > > > > > > > > > > > > > is always a good thing. > >> >> > > > > > > > > > > > > > > > > > > > > > But we can not only rely on > >> the > >> >> > > client > >> >> > > > > side > >> >> > > > > > > > > > checking, > >> >> > > > > > > > > > > > > > because > >> >> > > > > > > > > > > > > > > > for > >> >> > > > > > > > > > > > > > > > > > > > > > standalone cluster > >> TaskManagers > >> >> on > >> >> > > > > > different > >> >> > > > > > > > > > machines > >> >> > > > > > > > > > > > may > >> >> > > > > > > > > > > > > > > have > >> >> > > > > > > > > > > > > > > > > > > > different > >> >> > > > > > > > > > > > > > > > > > > > > > configurations and the > client > >> >> does > >> >> > > see > >> >> > > > > > that. > >> >> > > > > > > > > > > > > > > > > > > > > > What do you think? > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > Thank you~ > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > Xintong Song > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > On Thu, Aug 8, 2019 at 5:09 > >> PM > >> >> Yang > >> >> > > > Wang > >> >> > > > > < > >> >> > > > > > > > > > > > > > > > danrtsey...@gmail.com> > >> >> > > > > > > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > Hi xintong, > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > Thanks for your detailed > >> >> > proposal. > >> >> > > > > After > >> >> > > > > > > all > >> >> > > > > > > > > the > >> >> > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > > > configuration > >> >> > > > > > > > > > > > > > > > > > > > > are > >> >> > > > > > > > > > > > > > > > > > > > > > > introduced, it will be > more > >> >> > > powerful > >> >> > > > to > >> >> > > > > > > > control > >> >> > > > > > > > > > the > >> >> > > > > > > > > > > > > flink > >> >> > > > > > > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > > > > > usage. I > >> >> > > > > > > > > > > > > > > > > > > > > > > just have few questions > >> about > >> >> it. > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > - Native and Direct > >> Memory > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > We do not differentiate > >> user > >> >> > direct > >> >> > > > > > memory > >> >> > > > > > > > and > >> >> > > > > > > > > > > native > >> >> > > > > > > > > > > > > > > memory. > >> >> > > > > > > > > > > > > > > > > > They > >> >> > > > > > > > > > > > > > > > > > > > are > >> >> > > > > > > > > > > > > > > > > > > > > > all > >> >> > > > > > > > > > > > > > > > > > > > > > > included in task off-heap > >> >> memory. > >> >> > > > > Right? > >> >> > > > > > > So i > >> >> > > > > > > > > > don’t > >> >> > > > > > > > > > > > > think > >> >> > > > > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > > > > > could > >> >> > > > > > > > > > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > > > > > > > > set > >> >> > > > > > > > > > > > > > > > > > > > > > > the > -XX:MaxDirectMemorySize > >> >> > > > properly. I > >> >> > > > > > > > prefer > >> >> > > > > > > > > > > > leaving > >> >> > > > > > > > > > > > > > it a > >> >> > > > > > > > > > > > > > > > > very > >> >> > > > > > > > > > > > > > > > > > > > large > >> >> > > > > > > > > > > > > > > > > > > > > > > value. > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > - Memory Calculation > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > If the sum of and > >> fine-grained > >> >> > > > > > > memory(network > >> >> > > > > > > > > > > memory, > >> >> > > > > > > > > > > > > > > managed > >> >> > > > > > > > > > > > > > > > > > > memory, > >> >> > > > > > > > > > > > > > > > > > > > > > etc.) > >> >> > > > > > > > > > > > > > > > > > > > > > > is larger than total > >> process > >> >> > > memory, > >> >> > > > > how > >> >> > > > > > do > >> >> > > > > > > > we > >> >> > > > > > > > > > deal > >> >> > > > > > > > > > > > > with > >> >> > > > > > > > > > > > > > > this > >> >> > > > > > > > > > > > > > > > > > > > > situation? > >> >> > > > > > > > > > > > > > > > > > > > > > Do > >> >> > > > > > > > > > > > > > > > > > > > > > > we need to check the > memory > >> >> > > > > configuration > >> >> > > > > > > in > >> >> > > > > > > > > > > client? > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > Xintong Song < > >> >> > > tonysong...@gmail.com> > >> >> > > > > > > > > > 于2019年8月7日周三 > >> >> > > > > > > > > > > > > > > 下午10:14写道: > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > Hi everyone, > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > We would like to start > a > >> >> > > discussion > >> >> > > > > > > thread > >> >> > > > > > > > on > >> >> > > > > > > > > > > > > "FLIP-49: > >> >> > > > > > > > > > > > > > > > > Unified > >> >> > > > > > > > > > > > > > > > > > > > > Memory > >> >> > > > > > > > > > > > > > > > > > > > > > > > Configuration for > >> >> > > > TaskExecutors"[1], > >> >> > > > > > > where > >> >> > > > > > > > we > >> >> > > > > > > > > > > > > describe > >> >> > > > > > > > > > > > > > > how > >> >> > > > > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > > > > improve > >> >> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor memory > >> >> > > configurations. > >> >> > > > > The > >> >> > > > > > > > FLIP > >> >> > > > > > > > > > > > document > >> >> > > > > > > > > > > > > > is > >> >> > > > > > > > > > > > > > > > > mostly > >> >> > > > > > > > > > > > > > > > > > > > based > >> >> > > > > > > > > > > > > > > > > > > > > > on > >> >> > > > > > > > > > > > > > > > > > > > > > > an > >> >> > > > > > > > > > > > > > > > > > > > > > > > early design "Memory > >> >> Management > >> >> > > and > >> >> > > > > > > > > > Configuration > >> >> > > > > > > > > > > > > > > > > Reloaded"[2] > >> >> > > > > > > > > > > > > > > > > > by > >> >> > > > > > > > > > > > > > > > > > > > > > > Stephan, > >> >> > > > > > > > > > > > > > > > > > > > > > > > with updates from > >> follow-up > >> >> > > > > discussions > >> >> > > > > > > > both > >> >> > > > > > > > > > > online > >> >> > > > > > > > > > > > > and > >> >> > > > > > > > > > > > > > > > > > offline. > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > This FLIP addresses > >> several > >> >> > > > > > shortcomings > >> >> > > > > > > of > >> >> > > > > > > > > > > current > >> >> > > > > > > > > > > > > > > (Flink > >> >> > > > > > > > > > > > > > > > > 1.9) > >> >> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor memory > >> >> > > configuration. > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Different > >> configuration > >> >> > for > >> >> > > > > > > Streaming > >> >> > > > > > > > > and > >> >> > > > > > > > > > > > Batch. > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Complex and > >> difficult > >> >> > > > > > configuration > >> >> > > > > > > of > >> >> > > > > > > > > > > RocksDB > >> >> > > > > > > > > > > > > in > >> >> > > > > > > > > > > > > > > > > > Streaming. > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Complicated, > >> uncertain > >> >> and > >> >> > > > hard > >> >> > > > > to > >> >> > > > > > > > > > > understand. > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > Key changes to solve > the > >> >> > problems > >> >> > > > can > >> >> > > > > > be > >> >> > > > > > > > > > > summarized > >> >> > > > > > > > > > > > > as > >> >> > > > > > > > > > > > > > > > > follows. > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Extend memory > >> manager > >> >> to > >> >> > > also > >> >> > > > > > > account > >> >> > > > > > > > > for > >> >> > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > usage > >> >> > > > > > > > > > > > > > > > > by > >> >> > > > > > > > > > > > > > > > > > > > state > >> >> > > > > > > > > > > > > > > > > > > > > > > > backends. > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Modify how > >> TaskExecutor > >> >> > > memory > >> >> > > > > is > >> >> > > > > > > > > > > partitioned > >> >> > > > > > > > > > > > > > > > accounted > >> >> > > > > > > > > > > > > > > > > > > > > individual > >> >> > > > > > > > > > > > > > > > > > > > > > > > memory reservations > >> and > >> >> > pools. > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Simplify memory > >> >> > > configuration > >> >> > > > > > > options > >> >> > > > > > > > > and > >> >> > > > > > > > > > > > > > > calculations > >> >> > > > > > > > > > > > > > > > > > > logics. > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > Please find more > details > >> in > >> >> the > >> >> > > > FLIP > >> >> > > > > > wiki > >> >> > > > > > > > > > > document > >> >> > > > > > > > > > > > > [1]. > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > (Please note that the > >> early > >> >> > > design > >> >> > > > > doc > >> >> > > > > > > [2] > >> >> > > > > > > > is > >> >> > > > > > > > > > out > >> >> > > > > > > > > > > > of > >> >> > > > > > > > > > > > > > > sync, > >> >> > > > > > > > > > > > > > > > > and > >> >> > > > > > > > > > > > > > > > > > it > >> >> > > > > > > > > > > > > > > > > > > > is > >> >> > > > > > > > > > > > > > > > > > > > > > > > appreciated to have the > >> >> > > discussion > >> >> > > > in > >> >> > > > > > > this > >> >> > > > > > > > > > > mailing > >> >> > > > > > > > > > > > > list > >> >> > > > > > > > > > > > > > > > > > thread.) > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > Looking forward to your > >> >> > > feedbacks. > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > [1] > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > >> >> > > > > > > > > > >> >> > > > > > > > > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > [2] > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > >> >> > > > > > > > > > >> >> > > > > > > > > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > >> > https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 2:16 PM Xintong > Song > >> < > >> >> > > > > > > > > > tonysong...@gmail.com> > >> >> > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Thanks for sharing your opinion Till. > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > I'm also in favor of alternative 2. I was > >> >> > wondering > >> >> > > > > > whether > >> >> > > > > > > > we > >> >> > > > > > > > > > can > >> >> > > > > > > > > > > > > avoid > >> >> > > > > > > > > > > > > > > using Unsafe.allocate() for off-heap > >> managed > >> >> > memory > >> >> > > > and > >> >> > > > > > > > network > >> >> > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > with > >> >> > > > > > > > > > > > > > > alternative 3. But after giving it a > second > >> >> > > thought, > >> >> > > > I > >> >> > > > > > > think > >> >> > > > > > > > > even > >> >> > > > > > > > > > > for > >> >> > > > > > > > > > > > > > > alternative 3 using direct memory for > >> off-heap > >> >> > > > managed > >> >> > > > > > > memory > >> >> > > > > > > > > > could > >> >> > > > > > > > > > > > > cause > >> >> > > > > > > > > > > > > > > problems. > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Hi Yang, > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Regarding your concern, I think what > >> proposed > >> >> in > >> >> > > this > >> >> > > > > > FLIP > >> >> > > > > > > it > >> >> > > > > > > > > to > >> >> > > > > > > > > > > have > >> >> > > > > > > > > > > > > > both > >> >> > > > > > > > > > > > > > > off-heap managed memory and network > memory > >> >> > > allocated > >> >> > > > > > > through > >> >> > > > > > > > > > > > > > > Unsafe.allocate(), which means they are > >> >> > practically > >> >> > > > > > native > >> >> > > > > > > > > memory > >> >> > > > > > > > > > > and > >> >> > > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > limited by JVM max direct memory. The > only > >> >> parts > >> >> > of > >> >> > > > > > memory > >> >> > > > > > > > > > limited > >> >> > > > > > > > > > > by > >> >> > > > > > > > > > > > > JVM > >> >> > > > > > > > > > > > > > > max direct memory are task off-heap > memory > >> and > >> >> > JVM > >> >> > > > > > > overhead, > >> >> > > > > > > > > > which > >> >> > > > > > > > > > > > are > >> >> > > > > > > > > > > > > > > exactly alternative 2 suggests to set the > >> JVM > >> >> max > >> >> > > > > direct > >> >> > > > > > > > memory > >> >> > > > > > > > > > to. > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Thank you~ > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > Xintong Song > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 1:48 PM Till > >> Rohrmann > >> >> < > >> >> > > > > > > > > > > trohrm...@apache.org> > >> >> > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > Thanks for the clarification Xintong. I > >> >> > > understand > >> >> > > > > the > >> >> > > > > > > two > >> >> > > > > > > > > > > > > alternatives > >> >> > > > > > > > > > > > > > > > now. > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > I would be in favour of option 2 > because > >> it > >> >> > makes > >> >> > > > > > things > >> >> > > > > > > > > > > explicit. > >> >> > > > > > > > > > > > If > >> >> > > > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > > > don't limit the direct memory, I fear > >> that > >> >> we > >> >> > > might > >> >> > > > > end > >> >> > > > > > > up > >> >> > > > > > > > > in a > >> >> > > > > > > > > > > > > similar > >> >> > > > > > > > > > > > > > > > situation as we are currently in: The > >> user > >> >> > might > >> >> > > > see > >> >> > > > > > that > >> >> > > > > > > > her > >> >> > > > > > > > > > > > process > >> >> > > > > > > > > > > > > > > gets > >> >> > > > > > > > > > > > > > > > killed by the OS and does not know why > >> this > >> >> is > >> >> > > the > >> >> > > > > > case. > >> >> > > > > > > > > > > > > Consequently, > >> >> > > > > > > > > > > > > > > she > >> >> > > > > > > > > > > > > > > > tries to decrease the process memory > size > >> >> > > (similar > >> >> > > > to > >> >> > > > > > > > > > increasing > >> >> > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > cutoff > >> >> > > > > > > > > > > > > > > > ratio) in order to accommodate for the > >> extra > >> >> > > direct > >> >> > > > > > > memory. > >> >> > > > > > > > > > Even > >> >> > > > > > > > > > > > > worse, > >> >> > > > > > > > > > > > > > > she > >> >> > > > > > > > > > > > > > > > tries to decrease memory budgets which > >> are > >> >> not > >> >> > > > fully > >> >> > > > > > used > >> >> > > > > > > > and > >> >> > > > > > > > > > > hence > >> >> > > > > > > > > > > > > > won't > >> >> > > > > > > > > > > > > > > > change the overall memory consumption. > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > Cheers, > >> >> > > > > > > > > > > > > > > > Till > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 11:01 AM > Xintong > >> >> Song < > >> >> > > > > > > > > > > > tonysong...@gmail.com > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Let me explain this with a concrete > >> >> example > >> >> > > Till. > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Let's say we have the following > >> scenario. > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Total Process Memory: 1GB > >> >> > > > > > > > > > > > > > > > > JVM Direct Memory (Task Off-Heap > >> Memory + > >> >> JVM > >> >> > > > > > > Overhead): > >> >> > > > > > > > > > 200MB > >> >> > > > > > > > > > > > > > > > > Other Memory (JVM Heap Memory, JVM > >> >> Metaspace, > >> >> > > > > > Off-Heap > >> >> > > > > > > > > > Managed > >> >> > > > > > > > > > > > > Memory > >> >> > > > > > > > > > > > > > > and > >> >> > > > > > > > > > > > > > > > > Network Memory): 800MB > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > For alternative 2, we set > >> >> > > -XX:MaxDirectMemorySize > >> >> > > > > to > >> >> > > > > > > > 200MB. > >> >> > > > > > > > > > > > > > > > > For alternative 3, we set > >> >> > > -XX:MaxDirectMemorySize > >> >> > > > > to > >> >> > > > > > a > >> >> > > > > > > > very > >> >> > > > > > > > > > > large > >> >> > > > > > > > > > > > > > > value, > >> >> > > > > > > > > > > > > > > > > let's say 1TB. > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > If the actual direct memory usage of > >> Task > >> >> > > > Off-Heap > >> >> > > > > > > Memory > >> >> > > > > > > > > and > >> >> > > > > > > > > > > JVM > >> >> > > > > > > > > > > > > > > > Overhead > >> >> > > > > > > > > > > > > > > > > do not exceed 200MB, then > alternative 2 > >> >> and > >> >> > > > > > > alternative 3 > >> >> > > > > > > > > > > should > >> >> > > > > > > > > > > > > have > >> >> > > > > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > > same utility. Setting larger > >> >> > > > > -XX:MaxDirectMemorySize > >> >> > > > > > > will > >> >> > > > > > > > > not > >> >> > > > > > > > > > > > > reduce > >> >> > > > > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > > sizes of the other memory pools. > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > If the actual direct memory usage of > >> Task > >> >> > > > Off-Heap > >> >> > > > > > > Memory > >> >> > > > > > > > > and > >> >> > > > > > > > > > > JVM > >> >> > > > > > > > > > > > > > > > > Overhead potentially exceed 200MB, > then > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > - Alternative 2 suffers from > >> frequent > >> >> OOM. > >> >> > > To > >> >> > > > > > avoid > >> >> > > > > > > > > that, > >> >> > > > > > > > > > > the > >> >> > > > > > > > > > > > > only > >> >> > > > > > > > > > > > > > > > thing > >> >> > > > > > > > > > > > > > > > > user can do is to modify the > >> >> configuration > >> >> > > and > >> >> > > > > > > > increase > >> >> > > > > > > > > > JVM > >> >> > > > > > > > > > > > > Direct > >> >> > > > > > > > > > > > > > > > > Memory > >> >> > > > > > > > > > > > > > > > > (Task Off-Heap Memory + JVM > >> Overhead). > >> >> > Let's > >> >> > > > say > >> >> > > > > > > that > >> >> > > > > > > > > user > >> >> > > > > > > > > > > > > > increases > >> >> > > > > > > > > > > > > > > > JVM > >> >> > > > > > > > > > > > > > > > > Direct Memory to 250MB, this will > >> >> reduce > >> >> > the > >> >> > > > > total > >> >> > > > > > > > size > >> >> > > > > > > > > of > >> >> > > > > > > > > > > > other > >> >> > > > > > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > pools to 750MB, given the total > >> process > >> >> > > memory > >> >> > > > > > > remains > >> >> > > > > > > > > > 1GB. > >> >> > > > > > > > > > > > > > > > > - For alternative 3, there is no > >> >> chance of > >> >> > > > > direct > >> >> > > > > > > OOM. > >> >> > > > > > > > > > There > >> >> > > > > > > > > > > > are > >> >> > > > > > > > > > > > > > > > chances > >> >> > > > > > > > > > > > > > > > > of exceeding the total process > >> memory > >> >> > limit, > >> >> > > > but > >> >> > > > > > > given > >> >> > > > > > > > > > that > >> >> > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > process > >> >> > > > > > > > > > > > > > > > > may > >> >> > > > > > > > > > > > > > > > > not use up all the reserved native > >> >> memory > >> >> > > > > > (Off-Heap > >> >> > > > > > > > > > Managed > >> >> > > > > > > > > > > > > > Memory, > >> >> > > > > > > > > > > > > > > > > Network > >> >> > > > > > > > > > > > > > > > > Memory, JVM Metaspace), if the > >> actual > >> >> > direct > >> >> > > > > > memory > >> >> > > > > > > > > usage > >> >> > > > > > > > > > is > >> >> > > > > > > > > > > > > > > slightly > >> >> > > > > > > > > > > > > > > > > above > >> >> > > > > > > > > > > > > > > > > yet very close to 200MB, user > >> probably > >> >> do > >> >> > > not > >> >> > > > > need > >> >> > > > > > > to > >> >> > > > > > > > > > change > >> >> > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > > configurations. > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Therefore, I think from the user's > >> >> > > perspective, a > >> >> > > > > > > > feasible > >> >> > > > > > > > > > > > > > > configuration > >> >> > > > > > > > > > > > > > > > > for alternative 2 may lead to lower > >> >> resource > >> >> > > > > > > utilization > >> >> > > > > > > > > > > compared > >> >> > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > alternative 3. > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Thank you~ > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > Xintong Song > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 10:28 AM Till > >> >> > Rohrmann > >> >> > > < > >> >> > > > > > > > > > > > > trohrm...@apache.org > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > I guess you have to help me > >> understand > >> >> the > >> >> > > > > > difference > >> >> > > > > > > > > > between > >> >> > > > > > > > > > > > > > > > > alternative 2 > >> >> > > > > > > > > > > > > > > > > > and 3 wrt to memory under > utilization > >> >> > > Xintong. > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > - Alternative 2: set > >> >> XX:MaxDirectMemorySize > >> >> > > to > >> >> > > > > Task > >> >> > > > > > > > > > Off-Heap > >> >> > > > > > > > > > > > > Memory > >> >> > > > > > > > > > > > > > > and > >> >> > > > > > > > > > > > > > > > > JVM > >> >> > > > > > > > > > > > > > > > > > Overhead. Then there is the risk > that > >> >> this > >> >> > > size > >> >> > > > > is > >> >> > > > > > > too > >> >> > > > > > > > > low > >> >> > > > > > > > > > > > > > resulting > >> >> > > > > > > > > > > > > > > > in a > >> >> > > > > > > > > > > > > > > > > > lot of garbage collection and > >> >> potentially > >> >> > an > >> >> > > > OOM. > >> >> > > > > > > > > > > > > > > > > > - Alternative 3: set > >> >> XX:MaxDirectMemorySize > >> >> > > to > >> >> > > > > > > > something > >> >> > > > > > > > > > > larger > >> >> > > > > > > > > > > > > > than > >> >> > > > > > > > > > > > > > > > > > alternative 2. This would of course > >> >> reduce > >> >> > > the > >> >> > > > > > sizes > >> >> > > > > > > of > >> >> > > > > > > > > the > >> >> > > > > > > > > > > > other > >> >> > > > > > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > > types. > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > How would alternative 2 now result > >> in an > >> >> > > under > >> >> > > > > > > > > utilization > >> >> > > > > > > > > > of > >> >> > > > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > > compared to alternative 3? If > >> >> alternative 3 > >> >> > > > > > strictly > >> >> > > > > > > > > sets a > >> >> > > > > > > > > > > > > higher > >> >> > > > > > > > > > > > > > > max > >> >> > > > > > > > > > > > > > > > > > direct memory size and we use only > >> >> little, > >> >> > > > then I > >> >> > > > > > > would > >> >> > > > > > > > > > > expect > >> >> > > > > > > > > > > > > that > >> >> > > > > > > > > > > > > > > > > > alternative 3 results in memory > under > >> >> > > > > utilization. > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > Cheers, > >> >> > > > > > > > > > > > > > > > > > Till > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 4:19 PM > Yang > >> >> Wang < > >> >> > > > > > > > > > > > danrtsey...@gmail.com > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > Hi xintong,till > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Native and Direct Memory > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > My point is setting a very large > >> max > >> >> > direct > >> >> > > > > > memory > >> >> > > > > > > > size > >> >> > > > > > > > > > > when > >> >> > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > do > >> >> > > > > > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > > > > > differentiate direct and native > >> >> memory. > >> >> > If > >> >> > > > the > >> >> > > > > > > direct > >> >> > > > > > > > > > > > > > > > memory,including > >> >> > > > > > > > > > > > > > > > > > user > >> >> > > > > > > > > > > > > > > > > > > direct memory and framework > direct > >> >> > > > memory,could > >> >> > > > > > be > >> >> > > > > > > > > > > calculated > >> >> > > > > > > > > > > > > > > > > > > correctly,then > >> >> > > > > > > > > > > > > > > > > > > i am in favor of setting direct > >> memory > >> >> > with > >> >> > > > > fixed > >> >> > > > > > > > > value. > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Memory Calculation > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > I agree with xintong. For Yarn > and > >> >> k8s,we > >> >> > > > need > >> >> > > > > to > >> >> > > > > > > > check > >> >> > > > > > > > > > the > >> >> > > > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > > > configurations in client to avoid > >> >> > > submitting > >> >> > > > > > > > > successfully > >> >> > > > > > > > > > > and > >> >> > > > > > > > > > > > > > > failing > >> >> > > > > > > > > > > > > > > > > in > >> >> > > > > > > > > > > > > > > > > > > the flink master. > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > Best, > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > Yang > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > Xintong Song < > >> tonysong...@gmail.com > >> >> > > > > >于2019年8月13日 > >> >> > > > > > > > > > 周二22:07写道: > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Thanks for replying, Till. > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > About MemorySegment, I think > you > >> are > >> >> > > right > >> >> > > > > that > >> >> > > > > > > we > >> >> > > > > > > > > > should > >> >> > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > > include > >> >> > > > > > > > > > > > > > > > > > > this > >> >> > > > > > > > > > > > > > > > > > > > issue in the scope of this > FLIP. > >> >> This > >> >> > > FLIP > >> >> > > > > > should > >> >> > > > > > > > > > > > concentrate > >> >> > > > > > > > > > > > > > on > >> >> > > > > > > > > > > > > > > > how > >> >> > > > > > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > > > > configure memory pools for > >> >> > TaskExecutors, > >> >> > > > > with > >> >> > > > > > > > > minimum > >> >> > > > > > > > > > > > > > > involvement > >> >> > > > > > > > > > > > > > > > on > >> >> > > > > > > > > > > > > > > > > > how > >> >> > > > > > > > > > > > > > > > > > > > memory consumers use it. > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > About direct memory, I think > >> >> > alternative > >> >> > > 3 > >> >> > > > > may > >> >> > > > > > > not > >> >> > > > > > > > > > having > >> >> > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > same > >> >> > > > > > > > > > > > > > > > > over > >> >> > > > > > > > > > > > > > > > > > > > reservation issue that > >> alternative 2 > >> >> > > does, > >> >> > > > > but > >> >> > > > > > at > >> >> > > > > > > > the > >> >> > > > > > > > > > > cost > >> >> > > > > > > > > > > > of > >> >> > > > > > > > > > > > > > > risk > >> >> > > > > > > > > > > > > > > > of > >> >> > > > > > > > > > > > > > > > > > > over > >> >> > > > > > > > > > > > > > > > > > > > using memory at the container > >> level, > >> >> > > which > >> >> > > > is > >> >> > > > > > not > >> >> > > > > > > > > good. > >> >> > > > > > > > > > > My > >> >> > > > > > > > > > > > > > point > >> >> > > > > > > > > > > > > > > is > >> >> > > > > > > > > > > > > > > > > > that > >> >> > > > > > > > > > > > > > > > > > > > both "Task Off-Heap Memory" and > >> "JVM > >> >> > > > > Overhead" > >> >> > > > > > > are > >> >> > > > > > > > > not > >> >> > > > > > > > > > > easy > >> >> > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > config. > >> >> > > > > > > > > > > > > > > > > > > For > >> >> > > > > > > > > > > > > > > > > > > > alternative 2, users might > >> configure > >> >> > them > >> >> > > > > > higher > >> >> > > > > > > > than > >> >> > > > > > > > > > > what > >> >> > > > > > > > > > > > > > > actually > >> >> > > > > > > > > > > > > > > > > > > needed, > >> >> > > > > > > > > > > > > > > > > > > > just to avoid getting a direct > >> OOM. > >> >> For > >> >> > > > > > > alternative > >> >> > > > > > > > > 3, > >> >> > > > > > > > > > > > users > >> >> > > > > > > > > > > > > do > >> >> > > > > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > > > get > >> >> > > > > > > > > > > > > > > > > > > > direct OOM, so they may not > >> config > >> >> the > >> >> > > two > >> >> > > > > > > options > >> >> > > > > > > > > > > > > aggressively > >> >> > > > > > > > > > > > > > > > high. > >> >> > > > > > > > > > > > > > > > > > But > >> >> > > > > > > > > > > > > > > > > > > > the consequences are risks of > >> >> overall > >> >> > > > > container > >> >> > > > > > > > > memory > >> >> > > > > > > > > > > > usage > >> >> > > > > > > > > > > > > > > > exceeds > >> >> > > > > > > > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > > > > > budget. > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Thank you~ > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > Xintong Song > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 9:39 AM > >> Till > >> >> > > > > Rohrmann < > >> >> > > > > > > > > > > > > > > > trohrm...@apache.org> > >> >> > > > > > > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > Thanks for proposing this > FLIP > >> >> > Xintong. > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > All in all I think it already > >> >> looks > >> >> > > quite > >> >> > > > > > good. > >> >> > > > > > > > > > > > Concerning > >> >> > > > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > > first > >> >> > > > > > > > > > > > > > > > > > > open > >> >> > > > > > > > > > > > > > > > > > > > > question about allocating > >> memory > >> >> > > > segments, > >> >> > > > > I > >> >> > > > > > > was > >> >> > > > > > > > > > > > wondering > >> >> > > > > > > > > > > > > > > > whether > >> >> > > > > > > > > > > > > > > > > > this > >> >> > > > > > > > > > > > > > > > > > > > is > >> >> > > > > > > > > > > > > > > > > > > > > strictly necessary to do in > the > >> >> > context > >> >> > > > of > >> >> > > > > > this > >> >> > > > > > > > > FLIP > >> >> > > > > > > > > > or > >> >> > > > > > > > > > > > > > whether > >> >> > > > > > > > > > > > > > > > > this > >> >> > > > > > > > > > > > > > > > > > > > could > >> >> > > > > > > > > > > > > > > > > > > > > be done as a follow up? > Without > >> >> > knowing > >> >> > > > all > >> >> > > > > > > > > details, > >> >> > > > > > > > > > I > >> >> > > > > > > > > > > > > would > >> >> > > > > > > > > > > > > > be > >> >> > > > > > > > > > > > > > > > > > > concerned > >> >> > > > > > > > > > > > > > > > > > > > > that we would widen the scope > >> of > >> >> this > >> >> > > > FLIP > >> >> > > > > > too > >> >> > > > > > > > much > >> >> > > > > > > > > > > > because > >> >> > > > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > > > > would > >> >> > > > > > > > > > > > > > > > > > > have > >> >> > > > > > > > > > > > > > > > > > > > > to touch all the existing > call > >> >> sites > >> >> > of > >> >> > > > the > >> >> > > > > > > > > > > MemoryManager > >> >> > > > > > > > > > > > > > where > >> >> > > > > > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > > > > > > > allocate > >> >> > > > > > > > > > > > > > > > > > > > > memory segments (this should > >> >> mainly > >> >> > be > >> >> > > > > batch > >> >> > > > > > > > > > > operators). > >> >> > > > > > > > > > > > > The > >> >> > > > > > > > > > > > > > > > > addition > >> >> > > > > > > > > > > > > > > > > > > of > >> >> > > > > > > > > > > > > > > > > > > > > the memory reservation call > to > >> the > >> >> > > > > > > MemoryManager > >> >> > > > > > > > > > should > >> >> > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > be > >> >> > > > > > > > > > > > > > > > > > affected > >> >> > > > > > > > > > > > > > > > > > > > by > >> >> > > > > > > > > > > > > > > > > > > > > this and I would hope that > >> this is > >> >> > the > >> >> > > > only > >> >> > > > > > > point > >> >> > > > > > > > > of > >> >> > > > > > > > > > > > > > > interaction > >> >> > > > > > > > > > > > > > > > a > >> >> > > > > > > > > > > > > > > > > > > > > streaming job would have with > >> the > >> >> > > > > > > MemoryManager. > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > Concerning the second open > >> >> question > >> >> > > about > >> >> > > > > > > setting > >> >> > > > > > > > > or > >> >> > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > setting > >> >> > > > > > > > > > > > > > > > a > >> >> > > > > > > > > > > > > > > > > > max > >> >> > > > > > > > > > > > > > > > > > > > > direct memory limit, I would > >> also > >> >> be > >> >> > > > > > interested > >> >> > > > > > > > why > >> >> > > > > > > > > > > Yang > >> >> > > > > > > > > > > > > Wang > >> >> > > > > > > > > > > > > > > > > thinks > >> >> > > > > > > > > > > > > > > > > > > > > leaving it open would be > best. > >> My > >> >> > > concern > >> >> > > > > > about > >> >> > > > > > > > > this > >> >> > > > > > > > > > > > would > >> >> > > > > > > > > > > > > be > >> >> > > > > > > > > > > > > > > > that > >> >> > > > > > > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > > > > > > > would > >> >> > > > > > > > > > > > > > > > > > > > > be in a similar situation as > we > >> >> are > >> >> > now > >> >> > > > > with > >> >> > > > > > > the > >> >> > > > > > > > > > > > > > > > > RocksDBStateBackend. > >> >> > > > > > > > > > > > > > > > > > > If > >> >> > > > > > > > > > > > > > > > > > > > > the different memory pools > are > >> not > >> >> > > > clearly > >> >> > > > > > > > > separated > >> >> > > > > > > > > > > and > >> >> > > > > > > > > > > > > can > >> >> > > > > > > > > > > > > > > > spill > >> >> > > > > > > > > > > > > > > > > > over > >> >> > > > > > > > > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > > > > > a different pool, then it is > >> quite > >> >> > hard > >> >> > > > to > >> >> > > > > > > > > understand > >> >> > > > > > > > > > > > what > >> >> > > > > > > > > > > > > > > > exactly > >> >> > > > > > > > > > > > > > > > > > > > causes a > >> >> > > > > > > > > > > > > > > > > > > > > process to get killed for > using > >> >> too > >> >> > > much > >> >> > > > > > > memory. > >> >> > > > > > > > > This > >> >> > > > > > > > > > > > could > >> >> > > > > > > > > > > > > > > then > >> >> > > > > > > > > > > > > > > > > > easily > >> >> > > > > > > > > > > > > > > > > > > > > lead to a similar situation > >> what > >> >> we > >> >> > > have > >> >> > > > > with > >> >> > > > > > > the > >> >> > > > > > > > > > > > > > cutoff-ratio. > >> >> > > > > > > > > > > > > > > > So > >> >> > > > > > > > > > > > > > > > > > why > >> >> > > > > > > > > > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > > > > > > > setting a sane default value > >> for > >> >> max > >> >> > > > direct > >> >> > > > > > > > memory > >> >> > > > > > > > > > and > >> >> > > > > > > > > > > > > giving > >> >> > > > > > > > > > > > > > > the > >> >> > > > > > > > > > > > > > > > > > user > >> >> > > > > > > > > > > > > > > > > > > an > >> >> > > > > > > > > > > > > > > > > > > > > option to increase it if he > >> runs > >> >> into > >> >> > > an > >> >> > > > > OOM. > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > @Xintong, how would > >> alternative 2 > >> >> > lead > >> >> > > to > >> >> > > > > > lower > >> >> > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > utilization > >> >> > > > > > > > > > > > > > > > > > than > >> >> > > > > > > > > > > > > > > > > > > > > alternative 3 where we set > the > >> >> direct > >> >> > > > > memory > >> >> > > > > > > to a > >> >> > > > > > > > > > > higher > >> >> > > > > > > > > > > > > > value? > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > Cheers, > >> >> > > > > > > > > > > > > > > > > > > > > Till > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > On Fri, Aug 9, 2019 at 9:12 > AM > >> >> > Xintong > >> >> > > > > Song < > >> >> > > > > > > > > > > > > > > > tonysong...@gmail.com > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > Thanks for the feedback, > >> Yang. > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > Regarding your comments: > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > *Native and Direct Memory* > >> >> > > > > > > > > > > > > > > > > > > > > > I think setting a very > large > >> max > >> >> > > direct > >> >> > > > > > > memory > >> >> > > > > > > > > size > >> >> > > > > > > > > > > > > > > definitely > >> >> > > > > > > > > > > > > > > > > has > >> >> > > > > > > > > > > > > > > > > > > some > >> >> > > > > > > > > > > > > > > > > > > > > > good sides. E.g., we do not > >> >> worry > >> >> > > about > >> >> > > > > > > direct > >> >> > > > > > > > > OOM, > >> >> > > > > > > > > > > and > >> >> > > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > > > don't > >> >> > > > > > > > > > > > > > > > > > even > >> >> > > > > > > > > > > > > > > > > > > > > need > >> >> > > > > > > > > > > > > > > > > > > > > > to allocate managed / > network > >> >> > memory > >> >> > > > with > >> >> > > > > > > > > > > > > > Unsafe.allocate() . > >> >> > > > > > > > > > > > > > > > > > > > > > However, there are also > some > >> >> down > >> >> > > sides > >> >> > > > > of > >> >> > > > > > > > doing > >> >> > > > > > > > > > > this. > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > - One thing I can think > >> of is > >> >> > that > >> >> > > > if > >> >> > > > > a > >> >> > > > > > > task > >> >> > > > > > > > > > > > executor > >> >> > > > > > > > > > > > > > > > > container > >> >> > > > > > > > > > > > > > > > > > is > >> >> > > > > > > > > > > > > > > > > > > > > > killed due to overusing > >> >> memory, > >> >> > it > >> >> > > > > could > >> >> > > > > > > be > >> >> > > > > > > > > hard > >> >> > > > > > > > > > > for > >> >> > > > > > > > > > > > > use > >> >> > > > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > know > >> >> > > > > > > > > > > > > > > > > > > > which > >> >> > > > > > > > > > > > > > > > > > > > > > part > >> >> > > > > > > > > > > > > > > > > > > > > > of the memory is > overused. > >> >> > > > > > > > > > > > > > > > > > > > > > - Another down side is > >> that > >> >> the > >> >> > > JVM > >> >> > > > > > never > >> >> > > > > > > > > > trigger > >> >> > > > > > > > > > > GC > >> >> > > > > > > > > > > > > due > >> >> > > > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > > > reaching > >> >> > > > > > > > > > > > > > > > > > > > > max > >> >> > > > > > > > > > > > > > > > > > > > > > direct memory limit, > >> because > >> >> the > >> >> > > > limit > >> >> > > > > > is > >> >> > > > > > > > too > >> >> > > > > > > > > > high > >> >> > > > > > > > > > > > to > >> >> > > > > > > > > > > > > be > >> >> > > > > > > > > > > > > > > > > > reached. > >> >> > > > > > > > > > > > > > > > > > > > That > >> >> > > > > > > > > > > > > > > > > > > > > > means we kind of relay > on > >> >> heap > >> >> > > > memory > >> >> > > > > to > >> >> > > > > > > > > trigger > >> >> > > > > > > > > > > GC > >> >> > > > > > > > > > > > > and > >> >> > > > > > > > > > > > > > > > > release > >> >> > > > > > > > > > > > > > > > > > > > direct > >> >> > > > > > > > > > > > > > > > > > > > > > memory. That could be a > >> >> problem > >> >> > in > >> >> > > > > cases > >> >> > > > > > > > where > >> >> > > > > > > > > > we > >> >> > > > > > > > > > > > have > >> >> > > > > > > > > > > > > > > more > >> >> > > > > > > > > > > > > > > > > > direct > >> >> > > > > > > > > > > > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > > > > > > usage but not enough > heap > >> >> > activity > >> >> > > > to > >> >> > > > > > > > trigger > >> >> > > > > > > > > > the > >> >> > > > > > > > > > > > GC. > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > Maybe you can share your > >> reasons > >> >> > for > >> >> > > > > > > preferring > >> >> > > > > > > > > > > > setting a > >> >> > > > > > > > > > > > > > > very > >> >> > > > > > > > > > > > > > > > > > large > >> >> > > > > > > > > > > > > > > > > > > > > value, > >> >> > > > > > > > > > > > > > > > > > > > > > if there are anything else > I > >> >> > > > overlooked. > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > *Memory Calculation* > >> >> > > > > > > > > > > > > > > > > > > > > > If there is any conflict > >> between > >> >> > > > multiple > >> >> > > > > > > > > > > configuration > >> >> > > > > > > > > > > > > > that > >> >> > > > > > > > > > > > > > > > user > >> >> > > > > > > > > > > > > > > > > > > > > > explicitly specified, I > >> think we > >> >> > > should > >> >> > > > > > throw > >> >> > > > > > > > an > >> >> > > > > > > > > > > error. > >> >> > > > > > > > > > > > > > > > > > > > > > I think doing checking on > the > >> >> > client > >> >> > > > side > >> >> > > > > > is > >> >> > > > > > > a > >> >> > > > > > > > > good > >> >> > > > > > > > > > > > idea, > >> >> > > > > > > > > > > > > > so > >> >> > > > > > > > > > > > > > > > that > >> >> > > > > > > > > > > > > > > > > > on > >> >> > > > > > > > > > > > > > > > > > > > > Yarn / > >> >> > > > > > > > > > > > > > > > > > > > > > K8s we can discover the > >> problem > >> >> > > before > >> >> > > > > > > > submitting > >> >> > > > > > > > > > the > >> >> > > > > > > > > > > > > Flink > >> >> > > > > > > > > > > > > > > > > > cluster, > >> >> > > > > > > > > > > > > > > > > > > > > which > >> >> > > > > > > > > > > > > > > > > > > > > > is always a good thing. > >> >> > > > > > > > > > > > > > > > > > > > > > But we can not only rely on > >> the > >> >> > > client > >> >> > > > > side > >> >> > > > > > > > > > checking, > >> >> > > > > > > > > > > > > > because > >> >> > > > > > > > > > > > > > > > for > >> >> > > > > > > > > > > > > > > > > > > > > > standalone cluster > >> TaskManagers > >> >> on > >> >> > > > > > different > >> >> > > > > > > > > > machines > >> >> > > > > > > > > > > > may > >> >> > > > > > > > > > > > > > > have > >> >> > > > > > > > > > > > > > > > > > > > different > >> >> > > > > > > > > > > > > > > > > > > > > > configurations and the > client > >> >> does > >> >> > > see > >> >> > > > > > that. > >> >> > > > > > > > > > > > > > > > > > > > > > What do you think? > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > Thank you~ > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > Xintong Song > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > On Thu, Aug 8, 2019 at 5:09 > >> PM > >> >> Yang > >> >> > > > Wang > >> >> > > > > < > >> >> > > > > > > > > > > > > > > > danrtsey...@gmail.com> > >> >> > > > > > > > > > > > > > > > > > > > wrote: > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > Hi xintong, > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > Thanks for your detailed > >> >> > proposal. > >> >> > > > > After > >> >> > > > > > > all > >> >> > > > > > > > > the > >> >> > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > > > configuration > >> >> > > > > > > > > > > > > > > > > > > > > are > >> >> > > > > > > > > > > > > > > > > > > > > > > introduced, it will be > more > >> >> > > powerful > >> >> > > > to > >> >> > > > > > > > control > >> >> > > > > > > > > > the > >> >> > > > > > > > > > > > > flink > >> >> > > > > > > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > > > > > > > usage. I > >> >> > > > > > > > > > > > > > > > > > > > > > > just have few questions > >> about > >> >> it. > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > - Native and Direct > >> Memory > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > We do not differentiate > >> user > >> >> > direct > >> >> > > > > > memory > >> >> > > > > > > > and > >> >> > > > > > > > > > > native > >> >> > > > > > > > > > > > > > > memory. > >> >> > > > > > > > > > > > > > > > > > They > >> >> > > > > > > > > > > > > > > > > > > > are > >> >> > > > > > > > > > > > > > > > > > > > > > all > >> >> > > > > > > > > > > > > > > > > > > > > > > included in task off-heap > >> >> memory. > >> >> > > > > Right? > >> >> > > > > > > So i > >> >> > > > > > > > > > don’t > >> >> > > > > > > > > > > > > think > >> >> > > > > > > > > > > > > > > we > >> >> > > > > > > > > > > > > > > > > > could > >> >> > > > > > > > > > > > > > > > > > > > not > >> >> > > > > > > > > > > > > > > > > > > > > > set > >> >> > > > > > > > > > > > > > > > > > > > > > > the > -XX:MaxDirectMemorySize > >> >> > > > properly. I > >> >> > > > > > > > prefer > >> >> > > > > > > > > > > > leaving > >> >> > > > > > > > > > > > > > it a > >> >> > > > > > > > > > > > > > > > > very > >> >> > > > > > > > > > > > > > > > > > > > large > >> >> > > > > > > > > > > > > > > > > > > > > > > value. > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > - Memory Calculation > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > If the sum of and > >> fine-grained > >> >> > > > > > > memory(network > >> >> > > > > > > > > > > memory, > >> >> > > > > > > > > > > > > > > managed > >> >> > > > > > > > > > > > > > > > > > > memory, > >> >> > > > > > > > > > > > > > > > > > > > > > etc.) > >> >> > > > > > > > > > > > > > > > > > > > > > > is larger than total > >> process > >> >> > > memory, > >> >> > > > > how > >> >> > > > > > do > >> >> > > > > > > > we > >> >> > > > > > > > > > deal > >> >> > > > > > > > > > > > > with > >> >> > > > > > > > > > > > > > > this > >> >> > > > > > > > > > > > > > > > > > > > > situation? > >> >> > > > > > > > > > > > > > > > > > > > > > Do > >> >> > > > > > > > > > > > > > > > > > > > > > > we need to check the > memory > >> >> > > > > configuration > >> >> > > > > > > in > >> >> > > > > > > > > > > client? > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > Xintong Song < > >> >> > > tonysong...@gmail.com> > >> >> > > > > > > > > > 于2019年8月7日周三 > >> >> > > > > > > > > > > > > > > 下午10:14写道: > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > Hi everyone, > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > We would like to start > a > >> >> > > discussion > >> >> > > > > > > thread > >> >> > > > > > > > on > >> >> > > > > > > > > > > > > "FLIP-49: > >> >> > > > > > > > > > > > > > > > > Unified > >> >> > > > > > > > > > > > > > > > > > > > > Memory > >> >> > > > > > > > > > > > > > > > > > > > > > > > Configuration for > >> >> > > > TaskExecutors"[1], > >> >> > > > > > > where > >> >> > > > > > > > we > >> >> > > > > > > > > > > > > describe > >> >> > > > > > > > > > > > > > > how > >> >> > > > > > > > > > > > > > > > to > >> >> > > > > > > > > > > > > > > > > > > > improve > >> >> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor memory > >> >> > > configurations. > >> >> > > > > The > >> >> > > > > > > > FLIP > >> >> > > > > > > > > > > > document > >> >> > > > > > > > > > > > > > is > >> >> > > > > > > > > > > > > > > > > mostly > >> >> > > > > > > > > > > > > > > > > > > > based > >> >> > > > > > > > > > > > > > > > > > > > > > on > >> >> > > > > > > > > > > > > > > > > > > > > > > an > >> >> > > > > > > > > > > > > > > > > > > > > > > > early design "Memory > >> >> Management > >> >> > > and > >> >> > > > > > > > > > Configuration > >> >> > > > > > > > > > > > > > > > > Reloaded"[2] > >> >> > > > > > > > > > > > > > > > > > by > >> >> > > > > > > > > > > > > > > > > > > > > > > Stephan, > >> >> > > > > > > > > > > > > > > > > > > > > > > > with updates from > >> follow-up > >> >> > > > > discussions > >> >> > > > > > > > both > >> >> > > > > > > > > > > online > >> >> > > > > > > > > > > > > and > >> >> > > > > > > > > > > > > > > > > > offline. > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > This FLIP addresses > >> several > >> >> > > > > > shortcomings > >> >> > > > > > > of > >> >> > > > > > > > > > > current > >> >> > > > > > > > > > > > > > > (Flink > >> >> > > > > > > > > > > > > > > > > 1.9) > >> >> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor memory > >> >> > > configuration. > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Different > >> configuration > >> >> > for > >> >> > > > > > > Streaming > >> >> > > > > > > > > and > >> >> > > > > > > > > > > > Batch. > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Complex and > >> difficult > >> >> > > > > > configuration > >> >> > > > > > > of > >> >> > > > > > > > > > > RocksDB > >> >> > > > > > > > > > > > > in > >> >> > > > > > > > > > > > > > > > > > Streaming. > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Complicated, > >> uncertain > >> >> and > >> >> > > > hard > >> >> > > > > to > >> >> > > > > > > > > > > understand. > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > Key changes to solve > the > >> >> > problems > >> >> > > > can > >> >> > > > > > be > >> >> > > > > > > > > > > summarized > >> >> > > > > > > > > > > > > as > >> >> > > > > > > > > > > > > > > > > follows. > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Extend memory > >> manager > >> >> to > >> >> > > also > >> >> > > > > > > account > >> >> > > > > > > > > for > >> >> > > > > > > > > > > > memory > >> >> > > > > > > > > > > > > > > usage > >> >> > > > > > > > > > > > > > > > > by > >> >> > > > > > > > > > > > > > > > > > > > state > >> >> > > > > > > > > > > > > > > > > > > > > > > > backends. > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Modify how > >> TaskExecutor > >> >> > > memory > >> >> > > > > is > >> >> > > > > > > > > > > partitioned > >> >> > > > > > > > > > > > > > > > accounted > >> >> > > > > > > > > > > > > > > > > > > > > individual > >> >> > > > > > > > > > > > > > > > > > > > > > > > memory reservations > >> and > >> >> > pools. > >> >> > > > > > > > > > > > > > > > > > > > > > > > - Simplify memory > >> >> > > configuration > >> >> > > > > > > options > >> >> > > > > > > > > and > >> >> > > > > > > > > > > > > > > calculations > >> >> > > > > > > > > > > > > > > > > > > logics. > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > Please find more > details > >> in > >> >> the > >> >> > > > FLIP > >> >> > > > > > wiki > >> >> > > > > > > > > > > document > >> >> > > > > > > > > > > > > [1]. > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > (Please note that the > >> early > >> >> > > design > >> >> > > > > doc > >> >> > > > > > > [2] > >> >> > > > > > > > is > >> >> > > > > > > > > > out > >> >> > > > > > > > > > > > of > >> >> > > > > > > > > > > > > > > sync, > >> >> > > > > > > > > > > > > > > > > and > >> >> > > > > > > > > > > > > > > > > > it > >> >> > > > > > > > > > > > > > > > > > > > is > >> >> > > > > > > > > > > > > > > > > > > > > > > > appreciated to have the > >> >> > > discussion > >> >> > > > in > >> >> > > > > > > this > >> >> > > > > > > > > > > mailing > >> >> > > > > > > > > > > > > list > >> >> > > > > > > > > > > > > > > > > > thread.) > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > Looking forward to your > >> >> > > feedbacks. > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > Thank you~ > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > Xintong Song > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > [1] > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > >> >> > > > > > > > > > >> >> > > > > > > > > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > [2] > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > >> >> > > > > > > > > > >> >> > > > > > > > > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > >> > https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing > >> >> > > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > > >> >> > > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > > >> >> > > > > > > > > > > > >> >> > > > > > > > > > > >> >> > > > > > > > > > >> >> > > > > > > > > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > >> > > >> > > >