Hi all,

Thanks for the feedback, Xintong and Till.

> rename jobmanager.memory.direct.size into jobmanager.memory.off-heap.size

I am OK with that, to align it with the TM and avoid further complications for users. I will adjust the FLIP.

> change the default value of JM Metaspace size to 256 MB

Indeed, there is no reason to assume that user code would need less Metaspace in the JM. I will change it unless a better argument is made for another value.

I think all concerns have been resolved, so I am starting the vote in a separate thread.

Best,
Andrey

On Tue, Mar 17, 2020 at 6:16 PM Till Rohrmann <trohrm...@apache.org> wrote:

> Thanks for creating this FLIP Andrey.
>
> I agree with Xintong that we should rename jobmanager.memory.direct.size
> into jobmanager.memory.off-heap.size which accounts for native and direct
> memory usage. I think it should be good enough and is easier to understand
> for the user.
>
> Concerning the default value for the metaspace size: did we take the
> lessons learned from the TM metaspace size into account? IIRC we are about
> to change the default value to 256 MB.
>
> Feel free to start a vote once these last two questions have been resolved.
>
> Cheers,
> Till
>
> On Thu, Mar 12, 2020 at 4:25 AM Xintong Song <tonysong...@gmail.com>
> wrote:
>
> > Thanks Andrey for kicking this discussion off.
> >
> > Regarding "direct" vs. "off-heap", I'm personally in favor of renaming the
> > "direct" memory in the current FLIP-116 [1] to "off-heap" memory, and making
> > it also account for user native memory usage.
> >
> > On one hand, I think it would be good that JM & TM provide consistent
> > concepts and terminologies to users. IIUC, this is exactly the purpose of
> > this FLIP. For TMs, we already have "off-heap" memory accounting for both
> > direct and native memory usages, and we did this so that users do not need
> > to understand the differences between the two kinds.
> >
> > On the other hand, while for TMs it is hard to tell which kind of memory is
> > needed, mostly due to the variety of applications, I believe for the JM the
> > major memory consumption is heap memory in most cases. That means we probably
> > can rely on the heap activities to trigger GC in most cases, and the max
> > direct memory limit can act as a safety net. Moreover, I think the cases
> > should be very rare that we need native memory for user code. Therefore, we
> > probably should not break the JM/TM consistency for potential risks in such
> > rare cases.
> >
> > WDYT?
> >
> > Thank you~
> >
> > Xintong Song
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP+116%3A+Unified+Memory+Configuration+for+Job+Managers
> >
> > On Wed, Mar 11, 2020 at 8:56 PM Andrey Zagrebin <azagre...@apache.org>
> > wrote:
> >
> > > Hi All,
> > >
> > > As you may have noticed, the 1.10 release included extensive improvements
> > > to the memory management and configuration of Task Managers, FLIP-49 [1].
> > > The memory configuration of Job Managers has not been touched in 1.10.
> > >
> > > Although the Job Manager's memory model does not look as sophisticated as
> > > that of Task Managers, it makes sense to align the Job Manager memory model
> > > and settings with Task Managers. Therefore, we propose to reconsider it as
> > > well in 1.11, and I prepared FLIP-116 [2] for that.
> > >
> > > Any feedback is appreciated.
> > >
> > > So far, there is one discussion point about how to address native
> > > non-direct memory usage of user code. The user code can be run within the
> > > JM process, e.g. in certain job submission scenarios. For simplicity, the
> > > FLIP suggests only an option for direct memory, which is translated into
> > > the setting of the JVM direct memory limit.
> > > Although we documented for the TM that the similar parameters can also
> > > address native non-direct memory usage [3], this can lead to wrong
> > > functioning of the JVM direct memory limit. The direct memory option in the
> > > JM could also be named in a more general way, e.g. off-heap memory, but this
> > > naming would somewhat hide its nature as a JVM direct memory limit.
> > > On the other hand, JVM Overhead does not suffer from this problem and
> > > affects only the container/worker memory size, which is the most important
> > > matter to address for native non-direct memory consumption. The caveat
> > > here is that JVM Overhead was not supposed to be used by any Flink or user
> > > components.
> > >
> > > Thanks,
> > > Andrey
> > >
> > > [1]
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
> > > [2]
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP+116%3A+Unified+Memory+Configuration+for+Job+Managers
> > > [3]
> > > https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory/mem_detail.html#overview
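
For readers skimming the thread, the distinction discussed above (the off-heap option becomes a JVM direct memory limit, while JVM Overhead only enlarges the container/worker size) can be sketched roughly as follows. This is an illustration only and not Flink's actual implementation: the class and method names are hypothetical, the mapping of heap and metaspace to `-Xmx` and `-XX:MaxMetaspaceSize` is an assumption, and every size except the 256 MB metaspace default mentioned in the thread is made up.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not Flink code: shows which JM memory components
// would turn into JVM limits and which only count toward the container size.
public class JmMemorySketch {

    // Heap, off-heap and metaspace are assumed to map to JVM flags.
    // The off-heap size becomes the JVM direct memory limit, as the thread
    // describes, even though it is also meant to budget native memory.
    public static List<String> jvmArgs(int heapMb, int offHeapMb, int metaspaceMb) {
        return Arrays.asList(
                "-Xmx" + heapMb + "m",
                "-XX:MaxDirectMemorySize=" + offHeapMb + "m",
                "-XX:MaxMetaspaceSize=" + metaspaceMb + "m");
    }

    // JVM Overhead sets no JVM limit; it only adds to the total process
    // (container/worker) size.
    public static int totalProcessMb(int heapMb, int offHeapMb, int metaspaceMb, int jvmOverheadMb) {
        return heapMb + offHeapMb + metaspaceMb + jvmOverheadMb;
    }

    public static void main(String[] args) {
        // Made-up sizes, except the 256 MB metaspace default from the thread.
        System.out.println(jvmArgs(1024, 128, 256));
        System.out.println(totalProcessMb(1024, 128, 256, 192));
    }
}
```

In this sketch, raising the JVM Overhead value changes only the total requested from the container, whereas raising the off-heap value also loosens the direct memory limit, which is the trade-off raised in the original mail.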