[ https://issues.apache.org/jira/browse/FLINK-31610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17711825#comment-17711825 ]
Yingjie Cao commented on FLINK-31610: ------------------------------------- Sorry for the delayed reply. Totally agree that we should simply the LocalBufferPool implementation. IMO, the most complicated part of the buffer pool comes from the mechanism of interaction between global pool and local pool. One idea I can come up with is to make requesting buffer from global pool and registering of the global pool availability callback atomic, that is, we has such a method in the global NetworkBufferPool, like tryRequestMemorySegment(AvailableCallback callback), if we can request a buffer, we do not register the callback, otherwise, we register the callback. In the current implementation, there are too many branches, for example, if we can not request a buffer from global pool, we register a callback, that callback can be called immediately if the global pool availability is already completed and the local pool will try to request buffer from global pool again but the request can fail, if fail, it needs to register the callback again. What do you think? > Refactoring of LocalBufferPool > ------------------------------ > > Key: FLINK-31610 > URL: https://issues.apache.org/jira/browse/FLINK-31610 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network > Affects Versions: 1.17.0 > Reporter: Anton Kalashnikov > Priority: Major > > FLINK-31293 bug highlighted the issue with the internal mutual consistency of > different fields in LocalBufferPool. ex.: > - `numberOfRequestedOverdraftMemorySegments` > - `numberOfRequestedMemorySegments` > - `availableMemorySegment` > - `currentPoolSize` > Most of the problem was fixed already(I hope) but it is a good idea to > reorganize the code in such a way that all invariants between all fields > inside will be clearly determined and difficult to break. > As one example I can propose getting rid of > numberOfRequestedOverdraftMemorySegments and using existing > numberOfRequestedMemorySegments instead. That means: > - the pool will be available when `!availableMemorySegments.isEmpty() && > unavailableSubpartitionsCount == 0` > - we don't request a new `ordinary` buffer when > `numberOfRequestedMemorySegments >= currentPoolSize` but we request the > overdraft buffer instead > - `setNumBuffers` should work automatically without any changes > I think we can come up with a couple of such improvements to simplify the > code. -- This message was sent by Atlassian Jira (v8.20.10#820010)