[ https://issues.apache.org/jira/browse/FLINK-29985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Roman Khachatryan updated FLINK-29985:
--------------------------------------
    Description:
When TM is stopped by RM, its slot table is closed, causing all its slots to be released.
However, when TM is stopped by SIGTERM (i.e. external resource manager), its slot table is NOT closed.

When a slot is released, the associated resources are released as well, in particular, MemoryManager.
MemoryManager might hold not only memory, but also arbitrary shared resources (currently, PythonSharedResources and RocksDBSharedResources).
As of now, RocksDBSharedResources contains only ephemeral resources. Not sure about PythonSharedResources, but likely it is associated with a separate process.
That means that in standalone clusters, some resources might not be released.

  was:
When a slot is released, the associated resources are released as well, in particular, MemoryManager.
MemoryManager might hold not only memory, but also some arbitrary shared resources (currently, PythonSharedResources and RocksDBSharedResources).

When TM is stopped by JManager, its slot table is closed, causing all its slot to be released
When TM is stopped by SIGTERM (i.e. external resource manager), its slot table is NOT closed.

That means that in standalone clusters, some resources might not be released.

As of now, RocksDBSharedResources contains only ephemeral resources. Not sure about PythonSharedResources, but likely it is associated with a separate process.


> TaskManager doesn't close SlotTable on SIGTERM
> ----------------------------------------------
>
>                 Key: FLINK-29985
>                 URL: https://issues.apache.org/jira/browse/FLINK-29985
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Task
>    Affects Versions: 1.16.0, 1.15.3
>            Reporter: Roman Khachatryan
>            Priority: Major
>
> When TM is stopped by RM, its slot table is closed, causing all its slots to
> be released.
> However, when TM is stopped by SIGTERM (i.e. external resource manager), its
> slot table is NOT closed.
>
> When a slot is released, the associated resources are released as well, in
> particular, MemoryManager.
> MemoryManager might hold not only memory, but also arbitrary shared resources
> (currently, PythonSharedResources and RocksDBSharedResources).
> As of now, RocksDBSharedResources contains only ephemeral resources. Not sure
> about PythonSharedResources, but likely it is associated with a separate
> process.
> That means that in standalone clusters, some resources might not be released.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
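
For illustration only, the behaviour described in this issue can be sketched in plain Java. This is not Flink code: the SlotTable and SharedResource classes, the registerShutdownHook helper, and the resource name below are hypothetical stand-ins. The sketch relies only on the standard JVM behaviour that shutdown hooks run when the process receives SIGTERM, and shows one possible way to make sure a slot table is closed (and its shared resources released) on that path.

```java
import java.util.ArrayList;
import java.util.List;

public class SlotTableShutdownSketch {

    /** Hypothetical stand-in for a shared resource held through a memory manager. */
    static final class SharedResource implements AutoCloseable {
        private final String name;
        private volatile boolean released;

        SharedResource(String name) {
            this.name = name;
        }

        @Override
        public void close() {
            // A real resource might stop an external process here.
            released = true;
        }

        boolean isReleased() {
            return released;
        }

        String name() {
            return name;
        }
    }

    /** Hypothetical stand-in for a slot table; NOT Flink's actual class. */
    static final class SlotTable implements AutoCloseable {
        private final List<SharedResource> resources = new ArrayList<>();
        private boolean closed;

        synchronized SharedResource allocate(String name) {
            if (closed) {
                throw new IllegalStateException("slot table is closed");
            }
            SharedResource resource = new SharedResource(name);
            resources.add(resource);
            return resource;
        }

        /** Releases every resource; calling it a second time is a no-op. */
        @Override
        public synchronized void close() {
            if (closed) {
                return;
            }
            closed = true;
            for (SharedResource resource : resources) {
                resource.close();
            }
            resources.clear();
        }

        synchronized boolean isClosed() {
            return closed;
        }
    }

    /** Registers a JVM shutdown hook (run on SIGTERM) that closes the table. */
    static Thread registerShutdownHook(SlotTable table) {
        Thread hook = new Thread(table::close, "slot-table-shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        SlotTable table = new SlotTable();
        table.allocate("hypothetical-shared-resource");
        registerShutdownHook(table);
        // On normal exit or SIGTERM the hook closes the table.
        // SIGKILL cannot be intercepted, so nothing is released in that case.
    }
}
```

Making close() idempotent means it should not matter whether the table is closed by the regular stop path, by the hook, or by both.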