[ https://issues.apache.org/jira/browse/FLINK-20863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthias updated FLINK-20863:
-----------------------------
    Component/s: Runtime / Network

> Exclude network memory from ResourceProfile
> -------------------------------------------
>
>                 Key: FLINK-20863
>                 URL: https://issues.apache.org/jira/browse/FLINK-20863
>             Project: Flink
>          Issue Type: Task
>          Components: Runtime / Network
>            Reporter: Yangze Guo
>            Priority: Major
>             Fix For: 1.13.0
>
>
> Network memory is included in the current ResourceProfile implementation,
> with the expectation that fine-grained resource management will not deploy
> more tasks onto a TM than its network memory can accommodate.
> However, how much network memory each task needs depends heavily on the
> shuffle service implementation, and may change when switching to another
> shuffle service. Therefore, neither users nor the Flink runtime can easily
> specify network memory requirements for a task/slot at the moment.
> A concrete solution for controlling network memory is beyond the scope of
> this FLIP. However, we are aware of a few potential directions for solving
> this problem.
> - Make shuffle services adaptively control the amount of memory assigned to
> each task/slot, based on the given memory pool size. In this way, there
> should be no need to rely on fine-grained resource management to control
> network memory consumption.
> - Make shuffle services expose interfaces for calculating the network memory
> requirements of given slot sharing groups (SSGs). In this way, the Flink
> runtime can specify the calculated network memory requirements for slots
> without having to understand the internal details of different shuffle
> service implementations.
> For now, we propose to exclude network memory from ResourceProfile, to
> unblock the fine-grained resource management feature from the network
> memory control issue. If needed, it can be added back in the future, as
> long as there’s a good way to specify the requirement.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)