[ https://issues.apache.org/jira/browse/FLINK-20863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias updated FLINK-20863:
-----------------------------
    Component/s: Runtime / Network

> Exclude network memory from ResourceProfile
> -------------------------------------------
>
>                 Key: FLINK-20863
>                 URL: https://issues.apache.org/jira/browse/FLINK-20863
>             Project: Flink
>          Issue Type: Task
>          Components: Runtime / Network
>            Reporter: Yangze Guo
>            Priority: Major
>             Fix For: 1.13.0
>
>
> Network memory is included in the current ResourceProfile implementation so 
> that fine-grained resource management does not deploy more tasks onto a TM 
> than the TM's network memory can serve.
> However, how much network memory each task needs depends heavily on the 
> shuffle service implementation, and may vary when switching to another 
> shuffle service. Therefore, neither users nor the Flink runtime can easily 
> specify network memory requirements for a task/slot at the moment.
> A concrete solution for controlling network memory is beyond the scope of 
> this FLIP. However, we are aware of a few potential directions for solving 
> this problem.
> - Make shuffle services adaptively control the amount of memory assigned to 
> each task/slot, with respect to the given memory pool size. In this way, 
> there should be no need to rely on fine-grained resource management to 
> control the network memory consumption.
> - Make shuffle services expose interfaces for calculating network memory 
> requirements for given slot sharing groups (SSGs). In this way, the Flink 
> runtime can specify the 
> calculated network memory requirements for slots, without having to 
> understand the internal details of different shuffle service implementations.
> For now, we propose to exclude network memory from ResourceProfile, to 
> unblock the fine-grained resource management feature from the network 
> memory control issue. If needed, it can be added back in the future, 
> as long as there is a good way to specify the requirement.
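
As a rough illustration of the second direction above (a shuffle service
exposing an interface that calculates network memory for an SSG), here is a
minimal sketch. Everything in it is an assumption made for illustration:
TaskSpec, NetworkMemoryCalculator and FixedBuffersPerChannelCalculator are
hypothetical names, not Flink APIs, and the "fixed number of buffers per
channel" rule is a placeholder rather than how any real shuffle service
sizes its memory.

```java
import java.util.Arrays;
import java.util.List;

public class NetworkMemorySketch {

    /** Hypothetical description of one task in a slot sharing group. */
    static final class TaskSpec {
        final int inputChannels;
        final int outputSubpartitions;

        TaskSpec(int inputChannels, int outputSubpartitions) {
            this.inputChannels = inputChannels;
            this.outputSubpartitions = outputSubpartitions;
        }
    }

    /** Hypothetical interface a shuffle service could expose to the runtime. */
    interface NetworkMemoryCalculator {
        /** Returns the network memory, in bytes, needed by all tasks of one SSG. */
        long requiredBytes(List<TaskSpec> slotSharingGroup);
    }

    /** Toy implementation: a fixed number of buffers per channel (assumed rule). */
    static final class FixedBuffersPerChannelCalculator implements NetworkMemoryCalculator {
        private final long bufferSizeBytes;
        private final int buffersPerChannel;

        FixedBuffersPerChannelCalculator(long bufferSizeBytes, int buffersPerChannel) {
            this.bufferSizeBytes = bufferSizeBytes;
            this.buffersPerChannel = buffersPerChannel;
        }

        @Override
        public long requiredBytes(List<TaskSpec> slotSharingGroup) {
            long channels = 0;
            // Count every input channel and output subpartition in the group.
            for (TaskSpec task : slotSharingGroup) {
                channels += task.inputChannels + task.outputSubpartitions;
            }
            return channels * buffersPerChannel * bufferSizeBytes;
        }
    }

    public static void main(String[] args) {
        // Illustrative numbers only: 32 KiB buffers, 2 buffers per channel.
        NetworkMemoryCalculator calculator =
                new FixedBuffersPerChannelCalculator(32 * 1024, 2);
        List<TaskSpec> ssg = Arrays.asList(new TaskSpec(4, 8), new TaskSpec(8, 1));
        System.out.println("Required network memory (bytes): "
                + calculator.requiredBytes(ssg));
    }
}
```

With an interface of this shape, the runtime would only pass in a
description of the SSG and read back a byte count, so switching to another
shuffle service would mean swapping the calculator and nothing else.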



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
