[ https://issues.apache.org/jira/browse/FLINK-31610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715433#comment-17715433 ]
Weijie Guo commented on FLINK-31610: ------------------------------------ >> One idea I can come up with is to make requesting buffer from global pool >> and registering of the global pool availability callback atomic. Thanks [~kevin.cyj] for bringing this idea. I agree with you that the mechanism of interaction between {{LocalBufferPool}} and {{NetworkBufferPool}} is too complicated. Since the callback can be triggered by multiple threads (the current thread or threads of other tasks), it leads to code complexity and bugs (such as FLINK-29298). If we turn requesting buffers from the global pool and registering callbacks into atomic operations, then {{AvailabilityStatus}} class and related logic can be removed. The only possible concern is that after doing this, the current thread will no longer be allowed to trigger the callback immediately. I don't see the side effects of doing this at the moment, but we do need to be careful to double confirm. [~akalash] and [~fanrui], I'd like to hear your thoughts on this proposal. > Refactoring of LocalBufferPool > ------------------------------ > > Key: FLINK-31610 > URL: https://issues.apache.org/jira/browse/FLINK-31610 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network > Affects Versions: 1.17.0 > Reporter: Anton Kalashnikov > Priority: Major > > FLINK-31293 bug highlighted the issue with the internal mutual consistency of > different fields in LocalBufferPool. ex.: > - `numberOfRequestedOverdraftMemorySegments` > - `numberOfRequestedMemorySegments` > - `availableMemorySegment` > - `currentPoolSize` > Most of the problem was fixed already(I hope) but it is a good idea to > reorganize the code in such a way that all invariants between all fields > inside will be clearly determined and difficult to break. > As one example I can propose getting rid of > numberOfRequestedOverdraftMemorySegments and using existing > numberOfRequestedMemorySegments instead. That means: > - the pool will be available when `!availableMemorySegments.isEmpty() && > unavailableSubpartitionsCount == 0` > - we don't request a new `ordinary` buffer when > `numberOfRequestedMemorySegments >= currentPoolSize` but we request the > overdraft buffer instead > - `setNumBuffers` should work automatically without any changes > I think we can come up with a couple of such improvements to simplify the > code. -- This message was sent by Atlassian Jira (v8.20.10#820010)