[ https://issues.apache.org/jira/browse/FLINK-18554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157258#comment-17157258 ]
Truong Duc Kien commented on FLINK-18554:
-----------------------------------------

We investigated and came to the conclusion that it is almost impossible to account for mmap's memory without cgroups: Linux will use as much memory as it is allowed to for mmap-ed files. As a result, we've decided to simply turn mmap_read off when running on YARN without cgroups. Enabling mmap_read significantly lowers read latency when running RocksDB on a RAM disk, but our jobs run fine without it, so it's not a big loss. Since mmap_read defaults to OFF, we also think it's fine for Flink to ignore this edge case.

> Memory exceeds taskmanager.memory.process.size when enabling mmap_read for RocksDB
> -----------------------------------------------------------------------------------
>
>                 Key: FLINK-18554
>                 URL: https://issues.apache.org/jira/browse/FLINK-18554
>             Project: Flink
>          Issue Type: Bug
>      Components: Runtime / Configuration
>    Affects Versions: 1.11.0
>            Reporter: Truong Duc Kien
>            Priority: Major
>
> We are testing the Flink automatic memory management feature on Flink 1.11. However, YARN kept killing our containers because the processes' physical memory exceeded the limit, even though we had tuned the following configuration options:
> {code:java}
> taskmanager.memory.process.size
> taskmanager.memory.managed.fraction
> {code}
> We suspect that this is because we have enabled mmap_read for RocksDB, since turning this option off seems to fix the issue. Perhaps Flink's automatic memory management is unable to account for the additional memory required when using mmap_read?
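For reference, a minimal sketch (not taken from the issue) of how mmap reads can be kept disabled explicitly through Flink's RocksDBOptionsFactory. The class name NoMmapReadOptionsFactory is illustrative; such a factory would typically be registered via the state.backend.rocksdb.options-factory option or RocksDBStateBackend#setRocksDBOptions.

{code:java}
import java.util.Collection;

import org.apache.flink.contrib.streaming.state.RocksDBOptionsFactory;
import org.rocksdb.ColumnFamilyOptions;
import org.rocksdb.DBOptions;

// Illustrative factory: keeps RocksDB mmap reads switched off so the
// container's physical memory stays within what YARN can account for
// when no cgroup-based memory accounting is available.
public class NoMmapReadOptionsFactory implements RocksDBOptionsFactory {

    @Override
    public DBOptions createDBOptions(DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
        // allow_mmap_reads already defaults to false in RocksDB; setting it
        // explicitly makes the intent visible in the job's configuration.
        return currentOptions.setAllowMmapReads(false);
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
        // No column-family-level changes are needed for the mmap setting.
        return currentOptions;
    }
}
{code}

Since allow_mmap_reads defaults to false, an override like this only matters if mmap reads were enabled elsewhere, for example by another options factory supplied by the job.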