[ 
https://issues.apache.org/jira/browse/FLINK-4021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15423984#comment-15423984
 ] 

Zhijiang Wang commented on FLINK-4021:
--------------------------------------

Thank you for advice, it is indeed simple by 'return isStagedBuffer' directly. 

> Problem of setting autoread for netty channel when more tasks sharing the 
> same Tcp connection
> ---------------------------------------------------------------------------------------------
>
>                 Key: FLINK-4021
>                 URL: https://issues.apache.org/jira/browse/FLINK-4021
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination
>    Affects Versions: 1.0.2
>            Reporter: Zhijiang Wang
>            Assignee: Zhijiang Wang
>
> More than one task sharing the same Tcp connection for shuffling data.
> If the downstream task said as "A" has no available memory segment to read 
> netty buffer from network, it will set autoread as false for the channel.
> When the task A is failed or has available segments again, the netty handler 
> will be notified to process the staging buffers first, then reset autoread as 
> true. But in some scenarios, the autoread will not be set as true any more.
> That is when processing staging buffers, first find the corresponding input 
> channel for the buffer, if the task for that input channel is failed, the 
> decodeMsg method in PartitionRequestClientHandler will return false, that 
> means setting autoread as true will not be done anymore.
> In summary,  if one task "A" sets the autoread as false because of no 
> available segments, and resulting in some staging buffers. If another task 
> "B" is failed by accident corresponding to one staging buffer. When task A 
> trys to reset autoread as true, the process can not work because of task B 
> failed.
> I have fixed this problem in our application by adding one boolean parameter 
> in decodeBufferOrEvent method to distinguish whether this method is invoke by 
> netty IO thread channel read or staged message handler task in 
> PartitionRequestClientHandler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to