[ https://issues.apache.org/jira/browse/FLINK-4021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420965#comment-15420965 ]
ASF GitHub Bot commented on FLINK-4021:
---------------------------------------

Github user uce commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2141#discussion_r74760105

    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/netty/PartitionRequestClientHandler.java ---
    @@ -292,7 +296,11 @@
     else if (bufferListener.waitForBuffer(bufferProvider, bufferOrEvent)) {
     	return false;
     }
     else if (bufferProvider.isDestroyed()) {
    -	return false;
    --- End diff --

    We usually have a white space between keywords like `if` or `else`:
    ```java
    if (isStagedBuffer) {
    	return true;
    } else {
    	return false;
    }
    ```
    In this case, you can simplify the return value to `return isStagedBuffer`. Same for the other place where you use this.


> Problem of setting autoread for netty channel when more tasks sharing the
> same Tcp connection
> ---------------------------------------------------------------------------------------------
>
>                 Key: FLINK-4021
>                 URL: https://issues.apache.org/jira/browse/FLINK-4021
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination
>    Affects Versions: 1.0.2
>            Reporter: Zhijiang Wang
>            Assignee: Zhijiang Wang
>
> More than one task can share the same TCP connection for shuffling data.
> If a downstream task, say "A", has no available memory segment to read a
> netty buffer from the network, it sets autoread to false for the channel.
> When task A fails or has available segments again, the netty handler
> is notified to process the staged buffers first and then reset autoread to
> true. But in some scenarios, autoread is never set to true again.
> That is, when processing staged buffers, the handler first finds the
> corresponding input channel for each buffer; if the task for that input
> channel has failed, the decodeMsg method in PartitionRequestClientHandler
> returns false, which means autoread is never set back to true.
> In summary, task "A" sets autoread to false because it has no
> available segments, which leaves some buffers staged. Another task
> "B", to which one of the staged buffers belongs, then fails by accident.
> When task A tries to reset autoread to true, it cannot, because task B
> has failed.
> I have fixed this problem in our application by adding a boolean parameter
> to the decodeBufferOrEvent method to distinguish whether the method is
> invoked by the netty IO thread's channel read or by the staged message
> handler task in PartitionRequestClientHandler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
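
The scenario in the issue description can be illustrated with a small stand-alone sketch. It is a simplified model under stated assumptions, not Flink's actual handler: the class `AutoReadSketch`, its fields, and the `withFix` switch are hypothetical, and buffers are reduced to the id of their target input channel. Only the names `decodeBufferOrEvent` and `isStagedBuffer` and the `return isStagedBuffer` idea come from the discussion above.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical, simplified model of the handler. This is not Flink's class.
class AutoReadSketch {
    // Ids of input channels whose task has failed (assumed representation).
    final Set<Integer> failedChannels = new HashSet<>();
    // Staged buffers, each represented only by its target input channel id.
    final Queue<Integer> stagedBuffers = new ArrayDeque<>();
    // Mirrors the autoread setting of the shared channel.
    boolean autoRead = true;

    // Returns true if the caller may continue with the next message.
    // 'withFix' is a hypothetical switch to compare old and new behavior.
    boolean decodeBufferOrEvent(int channelId, boolean isStagedBuffer, boolean withFix) {
        if (failedChannels.contains(channelId)) {
            // Old behavior: always false. With the fix: a staged buffer that
            // belongs to a failed task is skipped, so the loop can go on.
            return withFix && isStagedBuffer;
        }
        return true; // buffer handed to a live task
    }

    // Models the staged message handler that runs once task A can read again.
    void processStagedBuffers(boolean withFix) {
        while (!stagedBuffers.isEmpty()) {
            if (!decodeBufferOrEvent(stagedBuffers.peek(), true, withFix)) {
                return; // stuck: autoRead is never switched back on
            }
            stagedBuffers.poll();
        }
        autoRead = true;
    }
}
```

In this model, staging buffers for channels 1, 2, 1 with channel 2 marked as failed and calling `processStagedBuffers(false)` leaves `autoRead` at false with two buffers still staged, which is the stall the reporter describes. Calling `processStagedBuffers(true)` on the same state drains the queue and sets `autoRead` back to true.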