[ https://issues.apache.org/jira/browse/FLINK-16404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hanhan.zhang updated FLINK-16404:
---------------------------------
    Attachment: image-2021-02-22-15-29-55-096.png

> Avoid caching buffers for blocked input channels before barrier alignment
> -------------------------------------------------------------------------
>
>                 Key: FLINK-16404
>                 URL: https://issues.apache.org/jira/browse/FLINK-16404
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Network
>            Reporter: Zhijiang
>            Assignee: Yingjie Cao
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.11.0
>
>         Attachments: image-2021-02-22-15-27-57-983.png,
>                      image-2021-02-22-15-29-55-096.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> One motivation for this issue is to reduce the in-flight data under back
> pressure in order to speed up checkpoints. The current default number of
> exclusive buffers per channel is 2. If we reduce it to 0 and increase the
> number of floating buffers somewhat to compensate, a deadlock might occur,
> because all the floating buffers might be requested by blocked input
> channels and never recycled until barrier alignment.
> To address this deadlock concern, we can make some logic changes on both
> the sender and receiver sides.
> * Sender side: It should revoke previously received credit after sending
> the checkpoint barrier, which means it would not send any subsequent buffers
> until it receives new credits.
> * Receiver side: The respective channel releases its requested floating
> buffers when a barrier is received from the network. After barrier
> alignment, it requests floating buffers for the channels with positive
> backlog and notifies the sender side of the available credits. The sender
> can then continue transporting buffers.
> Based on the above changes, we can also remove the `BufferStorage` component
> completely, because the receiver would never read buffers for blocked
> channels. Another possible benefit is that the floating buffers might be
> put to better use before barrier alignment.
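
The sender-side and receiver-side changes described above can be sketched as a toy model of a single channel. This is only an illustration under stated assumptions: the classes `FloatingPool`, `Sender`, and `Receiver` and all of their methods are hypothetical names invented for this sketch, not Flink APIs, and the real network stack is asynchronous and considerably more involved.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of the proposal; every name here is hypothetical, not a Flink API. */
public class BarrierCreditSketch {

    /** Shared pool of floating buffers on the receiver side. */
    static final class FloatingPool {
        private int available;
        FloatingPool(int size) { this.available = size; }
        /** Hands out up to `wanted` buffers and returns how many were granted. */
        int request(int wanted) {
            int granted = Math.min(wanted, available);
            available -= granted;
            return granted;
        }
        void recycle(int count) { available += count; }
        int available() { return available; }
    }

    /** Sender side of one channel: data may only be sent while credit is held. */
    static final class Sender {
        private final Deque<String> backlog = new ArrayDeque<>();
        private int credit;
        void enqueue(String buffer) { backlog.add(buffer); }
        void addCredit(int n) { credit += n; }
        int backlog() { return backlog.size(); }
        int credit() { return credit; }
        /** Returns the next data buffer, or null when there is no credit or no data. */
        String pollData() {
            if (credit == 0 || backlog.isEmpty()) {
                return null;
            }
            credit--;
            return backlog.poll();
        }
        /** Sending the barrier revokes all previously received credit. */
        void sendBarrier() { credit = 0; }
    }

    /** Receiver side of one channel, holding floating buffers taken from the pool. */
    static final class Receiver {
        private final FloatingPool pool;
        private int held;
        private boolean blocked;
        Receiver(FloatingPool pool) { this.pool = pool; }
        /** Requests floating buffers for the sender's backlog and announces them as credit. */
        void grantCredit(Sender sender) {
            if (blocked) {
                return; // a blocked channel must not hold on to floating buffers
            }
            int granted = pool.request(sender.backlog());
            held += granted;
            sender.addCredit(granted);
        }
        /** A data buffer arrived and was consumed; its floating buffer is recycled. */
        void onData() {
            if (held > 0) {
                held--;
                pool.recycle(1);
            }
        }
        /** On barrier: block the channel and return unused floating buffers to the pool. */
        void onBarrier() {
            blocked = true;
            pool.recycle(held);
            held = 0;
        }
        /** After alignment: unblock and request buffers again if the backlog is positive. */
        void onAlignment(Sender sender) {
            blocked = false;
            grantCredit(sender);
        }
    }

    public static void main(String[] args) {
        FloatingPool pool = new FloatingPool(2);
        Sender sender = new Sender();
        Receiver receiver = new Receiver(pool);
        sender.enqueue("a1");
        sender.enqueue("a2");
        sender.enqueue("a3");

        receiver.grantCredit(sender);   // takes the floating buffers for this channel
        System.out.println("sent: " + sender.pollData());
        receiver.onData();
        sender.sendBarrier();           // sender revokes its remaining credit
        receiver.onBarrier();           // receiver releases its held floating buffers
        System.out.println("pool while blocked: " + pool.available());
        System.out.println("sent while blocked: " + sender.pollData());
        receiver.onAlignment(sender);   // credits are announced again after alignment
        System.out.println("sent after alignment: " + sender.pollData());
    }
}
```

In this sketch the two sides stay consistent because the sender drops its credit at the same point in the stream where the receiver gives its buffers back, so the blocked channel neither receives data nor keeps floating buffers away from other channels.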
> The only side effect is a somewhat cold start after barrier alignment: the
> sender side has to wait for credit feedback before it can transport data
> right after alignment, which adds latency and reduces network throughput.
> But since the checkpoint interval is generally not too short, this side
> effect can be ignored in practice. We can further verify this via the
> existing micro-benchmark.
> After this ticket is done, we still cannot set exclusive buffers to zero at
> the moment; there is another deadlock issue that will be solved separately
> in another ticket.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)