[ 
https://issues.apache.org/jira/browse/FLINK-31610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715433#comment-17715433
 ] 

Weijie Guo commented on FLINK-31610:
------------------------------------

>> One idea I can come up with is to make requesting buffer from global pool 
>> and registering of the global pool availability callback atomic.

Thanks [~kevin.cyj] for bringing this idea. I agree with you that the mechanism 
of interaction between {{LocalBufferPool}} and {{NetworkBufferPool}} is too 
complicated.

Since the callback can be triggered by multiple threads (the current thread or 
threads of other tasks), it leads to code complexity and bugs (such as 
FLINK-29298). If we turn requesting buffers from the global pool and 
registering callbacks into atomic operations, then {{AvailabilityStatus}} class 
and related logic can be removed. The only possible concern is that after doing 
this, the current thread will no longer be allowed to trigger the callback 
immediately. I don't see the side effects of doing this at the moment, but we 
do need to be careful to double confirm. 

[~akalash] and [~fanrui], I'd like to hear your thoughts on this proposal.

> Refactoring of LocalBufferPool
> ------------------------------
>
>                 Key: FLINK-31610
>                 URL: https://issues.apache.org/jira/browse/FLINK-31610
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>    Affects Versions: 1.17.0
>            Reporter: Anton Kalashnikov
>            Priority: Major
>
> FLINK-31293 bug highlighted the issue with the internal mutual consistency of 
> different fields in LocalBufferPool. ex.:
> -  `numberOfRequestedOverdraftMemorySegments`
> -  `numberOfRequestedMemorySegments`
> -  `availableMemorySegment`
> -  `currentPoolSize`
> Most of the problem was fixed already(I hope) but it is a good idea to 
> reorganize the code in such a way that all invariants between all fields 
> inside will be clearly determined and difficult to break.
> As one example I can propose getting rid of 
> numberOfRequestedOverdraftMemorySegments and using existing 
> numberOfRequestedMemorySegments instead. That means:
> - the pool will be available when `!availableMemorySegments.isEmpty() && 
> unavailableSubpartitionsCount == 0`
> - we don't request a new `ordinary` buffer when 
> `numberOfRequestedMemorySegments >=  currentPoolSize` but we request the 
> overdraft buffer instead
> - `setNumBuffers` should work automatically without any changes
> I think we can come up with a couple of such improvements to simplify the 
> code.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to