Yuxin Tan created FLINK-28942:
---------------------------------

Summary: Deadlock may occur when releasing readers for SortMergeResultPartition
Key: FLINK-28942
URL: https://issues.apache.org/jira/browse/FLINK-28942
Project: Flink
Issue Type: Bug
Affects Versions: 1.16.0
Reporter: Yuxin Tan

After FLINK-28373 (https://issues.apache.org/jira/browse/FLINK-28373) added logic to recycle buffers in CompositeBuffer, a deadlock can occur between the SortMergeResultPartition lock and the SingleInputGate lock when data is read and buffers are recycled at the same time.

In short, two threads acquire the two locks in opposite order:

1. SingleInputGate.getNextBufferOrEvent (holds the SingleInputGate lock) -> CompositeBuffer.getFullBufferData -> CompositeBuffer.recycleBuffer -> waits for the SortMergeResultPartition lock;
2. ResultPartitionManager.releasePartition (holds the SortMergeResultPartition lock) -> SortMergeSubpartitionReader.notifyDataAvailable -> SingleInputGate.notifyChannelNonEmpty -> waits for the SingleInputGate lock.

The probability of hitting this deadlock is very low, but we should fix it as soon as possible.
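
For readers unfamiliar with this failure mode, the two call chains in this report form a classic lock-order inversion. The following is a minimal standalone sketch of that pattern, not Flink code: the class name, the two plain `Object` monitors standing in for the SingleInputGate and SortMergeResultPartition locks, and the latch that forces the unlucky interleaving are all assumptions made for illustration. It uses the JDK's `ThreadMXBean.findMonitorDeadlockedThreads()` to confirm the cycle rather than hanging.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

final class LockOrderInversionSketch {

    /** Returns true if the JVM reports the two worker threads as deadlocked. */
    static boolean reproduce() {
        // Hypothetical stand-ins for the two monitors named in the report.
        final Object inputGateLock = new Object();
        final Object partitionLock = new Object();
        // Forces the rare interleaving: each thread holds its first lock
        // before either one tries to take its second lock.
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        // Chain 1: read path, gate lock first, then partition lock (recycle).
        Thread reader = new Thread(() -> {
            synchronized (inputGateLock) {
                awaitOther(bothHoldFirstLock);
                synchronized (partitionLock) {
                    // never reached once the cycle has formed
                }
            }
        }, "reader");

        // Chain 2: release path, partition lock first, then gate lock (notify).
        Thread releaser = new Thread(() -> {
            synchronized (partitionLock) {
                awaitOther(bothHoldFirstLock);
                synchronized (inputGateLock) {
                    // never reached once the cycle has formed
                }
            }
        }, "releaser");

        // Daemon threads so the JVM can still exit while they stay blocked.
        reader.setDaemon(true);
        releaser.setDaemon(true);
        reader.start();
        releaser.start();

        // Poll the JVM's monitor deadlock detector for up to about 5 seconds.
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {
            long[] ids = threads.findMonitorDeadlockedThreads();
            if (ids != null && ids.length >= 2) {
                return true;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    private static void awaitOther(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlock detected: " + reproduce());
    }
}
```

In the real code paths there is no latch, so the cycle only forms when both threads happen to sit between their first and second lock acquisition at the same moment, which is consistent with the low probability noted above. General remedies for this pattern are to acquire the two locks in a consistent order or to avoid calling into the other component while holding a lock; which of these applies to FLINK-28942 is not stated in this report.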