[jira] [Commented] (FLINK-31610) Refactoring of LocalBufferPool

Yingjie Cao (Jira) Thu, 13 Apr 2023 03:42:23 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-31610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17711825#comment-17711825
 ]


Yingjie Cao commented on FLINK-31610:
-------------------------------------

Sorry for the delayed reply.

Totally agree that we should simply the LocalBufferPool implementation. IMO, 
the most complicated part of the buffer pool comes from the mechanism of 
interaction between global pool and local pool.

One idea I can come up with is to make requesting buffer from global pool and 
registering of the global pool availability callback atomic, that is, we has 
such a method in the global NetworkBufferPool, like 
tryRequestMemorySegment(AvailableCallback callback), if we can request a 
buffer, we do not register the callback, otherwise, we register the callback.

In the current implementation, there are too many branches, for example, if we 
can not request a buffer from global pool, we register a callback, that 
callback can be called immediately if the global pool availability is already 
completed and the local pool will try to request buffer from global pool again 
but the request can fail, if fail, it needs to register the callback again.

What do you think? 

> Refactoring of LocalBufferPool
> ------------------------------
>
>                 Key: FLINK-31610
>                 URL: https://issues.apache.org/jira/browse/FLINK-31610
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>    Affects Versions: 1.17.0
>            Reporter: Anton Kalashnikov
>            Priority: Major
>
> FLINK-31293 bug highlighted the issue with the internal mutual consistency of 
> different fields in LocalBufferPool. ex.:
> -  `numberOfRequestedOverdraftMemorySegments`
> -  `numberOfRequestedMemorySegments`
> -  `availableMemorySegment`
> -  `currentPoolSize`
> Most of the problem was fixed already(I hope) but it is a good idea to 
> reorganize the code in such a way that all invariants between all fields 
> inside will be clearly determined and difficult to break.
> As one example I can propose getting rid of 
> numberOfRequestedOverdraftMemorySegments and using existing 
> numberOfRequestedMemorySegments instead. That means:
> - the pool will be available when `!availableMemorySegments.isEmpty() && 
> unavailableSubpartitionsCount == 0`
> - we don't request a new `ordinary` buffer when 
> `numberOfRequestedMemorySegments >=  currentPoolSize` but we request the 
> overdraft buffer instead
> - `setNumBuffers` should work automatically without any changes
> I think we can come up with a couple of such improvements to simplify the 
> code.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Commented] (FLINK-31610) Refactoring of LocalBufferPool

Reply via email to