[ https://issues.apache.org/jira/browse/FLINK-4021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15424183#comment-15424183 ]
Zhijiang Wang commented on FLINK-4021:
--------------------------------------

Got it. I will modify the return and add the test to the PR branch this week.

> Problem of setting autoread for netty channel when more tasks sharing the same Tcp connection
> ---------------------------------------------------------------------------------------------
>
>                 Key: FLINK-4021
>                 URL: https://issues.apache.org/jira/browse/FLINK-4021
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination
>    Affects Versions: 1.0.2
>            Reporter: Zhijiang Wang
>            Assignee: Zhijiang Wang
>
> More than one task can share the same TCP connection for shuffling data.
> If a downstream task, call it "A", has no available memory segment to read a
> Netty buffer from the network, it sets autoread to false for the channel.
> When task A fails or has available segments again, the Netty handler is
> notified to process the staged buffers first and then reset autoread to
> true. In some scenarios, however, autoread is never set to true again.
> While processing the staged buffers, the handler first looks up the input
> channel for each buffer. If the task for that input channel has failed, the
> decodeMsg method in PartitionRequestClientHandler returns false, which
> means that autoread is never set back to true.
> In summary: task "A" sets autoread to false because it has no available
> segments, which leaves some staged buffers behind. Another task "B", which
> owns one of those staged buffers, then fails unexpectedly. When task A tries
> to reset autoread to true, it cannot, because task B has failed.
> I have fixed this problem in our application by adding a boolean parameter
> to the decodeBufferOrEvent method. The parameter distinguishes whether the
> method is invoked by the Netty IO thread's channel read or by the staged
> message handler task in PartitionRequestClientHandler.
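
The failure mode described in the issue can be reproduced with a small stand-alone model. The sketch below is not the Flink or Netty code: the class `AutoReadSketch`, its fields, and the `distinguishStagedCalls` switch are hypothetical names made up for illustration, and each staged buffer is reduced to the name of the task that owns it. Only the method name `decodeBufferOrEvent` and the idea of an extra boolean parameter come from the issue text. The assumption here is that, when the call comes from the staged-buffer path, a buffer whose owner has failed is dropped and counted as handled, so the drain loop can finish and re-enable autoread.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Simplified, hypothetical model of the handler; not the Flink or Netty API.
class AutoReadSketch {
    private final Set<String> failedTasks = new HashSet<>();
    // Each staged buffer is represented only by the name of its owning task.
    private final Queue<String> stagedBuffers = new ArrayDeque<>();
    // false models the reported behaviour, true models the proposed fix.
    private final boolean distinguishStagedCalls;
    private boolean autoRead = true;

    AutoReadSketch(boolean distinguishStagedCalls) {
        this.distinguishStagedCalls = distinguishStagedCalls;
    }

    void failTask(String task) {
        failedTasks.add(task);
    }

    boolean isAutoRead() {
        return autoRead;
    }

    // A task has no free memory segment: park the buffer and stop reading.
    void stage(String ownerTask) {
        stagedBuffers.add(ownerTask);
        autoRead = false;
    }

    // Returns true if the caller may go on to the next buffer.
    boolean decodeBufferOrEvent(String ownerTask, boolean isStagedBuffer) {
        if (failedTasks.contains(ownerTask)) {
            // The owner is gone, so the buffer is dropped. Without the flag
            // this reports false and the staged-buffer loop gives up.
            return distinguishStagedCalls && isStagedBuffer;
        }
        return true; // buffer delivered to a live task
    }

    // Runs when a task has segments again: drain, then re-enable reads.
    void processStagedBuffers() {
        while (!stagedBuffers.isEmpty()) {
            String owner = stagedBuffers.peek();
            if (!decodeBufferOrEvent(owner, true)) {
                return; // autoRead stays false
            }
            stagedBuffers.remove();
        }
        autoRead = true;
    }

    // Task A stalls the channel, task B fails, then the staged buffers drain.
    static AutoReadSketch runScenario(boolean distinguishStagedCalls) {
        AutoReadSketch handler = new AutoReadSketch(distinguishStagedCalls);
        handler.stage("A");
        handler.stage("B");
        handler.failTask("B");
        handler.processStagedBuffers();
        return handler;
    }

    public static void main(String[] args) {
        System.out.println("without flag: autoRead=" + runScenario(false).isAutoRead());
        System.out.println("with flag:    autoRead=" + runScenario(true).isAutoRead());
    }
}
```

In this model, the run without the flag leaves autoread at false after task B fails, while the run with the flag restores it to true. How the real handler should treat the dropped buffer (release, cancel the request, and so on) is outside the scope of the sketch.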