[jira] [Commented] (FLINK-4021) Problem of setting autoread for netty channel when more tasks sharing the same Tcp connection

ASF GitHub Bot (JIRA) Mon, 15 Aug 2016 06:23:48 -0700

    [ https://issues.apache.org/jira/browse/FLINK-4021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420965#comment-15420965 ]


ASF GitHub Bot commented on FLINK-4021:
---------------------------------------

Github user uce commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2141#discussion_r74760105
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/netty/PartitionRequestClientHandler.java ---
    @@ -292,7 +296,11 @@ else if (bufferListener.waitForBuffer(bufferProvider, bufferOrEvent)) {
                                                return false;
                                        }
                                        else if (bufferProvider.isDestroyed()) {
    -                                           return false;
    --- End diff --
    
    We usually put a space between keywords like `if` or `else` and the adjacent braces and parentheses:
    ```java
    if (isStagedBuffer) {
        return true;
    } else {
        return false;
    }
    ```
    
    In this case, you can simplify the return value to `return isStagedBuffer`. Same for the other place where you use this.
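    
    As a minimal sketch of that simplification, the two forms below return the same value for either input. The class and method names are hypothetical and only `isStagedBuffer` is taken from the diff:
    
    ```java
    // Hypothetical helper class for illustration; not part of the pull request.
    final class ReturnStyle {
    
        // Verbose form, written with the usual whitespace around keywords.
        static boolean verbose(boolean isStagedBuffer) {
            if (isStagedBuffer) {
                return true;
            } else {
                return false;
            }
        }
    
        // Simplified form: the flag already is the value to return.
        static boolean simplified(boolean isStagedBuffer) {
            return isStagedBuffer;
        }
    
        public static void main(String[] args) {
            for (boolean flag : new boolean[] {true, false}) {
                System.out.println(flag + " -> " + verbose(flag) + " / " + simplified(flag));
            }
        }
    }
    ```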


> Problem of setting autoread for netty channel when more tasks sharing the 
> same Tcp connection
> ---------------------------------------------------------------------------------------------
>
>                 Key: FLINK-4021
>                 URL: https://issues.apache.org/jira/browse/FLINK-4021
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination
>    Affects Versions: 1.0.2
>            Reporter: Zhijiang Wang
>            Assignee: Zhijiang Wang
>
> More than one task can share the same TCP connection for shuffling data.
> If a downstream task, say "A", has no available memory segment to read a 
> netty buffer from the network, it sets autoread to false for the channel.
> When task A fails or has available segments again, the netty handler 
> is notified to process the staged buffers first and then reset autoread to 
> true. But in some scenarios, autoread is never set back to true.
> This happens while the staged buffers are processed: the handler first finds 
> the corresponding input channel for each buffer, and if the task for that 
> input channel has failed, the decodeMsg method in 
> PartitionRequestClientHandler returns false, which means autoread is never 
> set back to true.
> In summary: task "A" sets autoread to false because it has no available 
> segments, which results in some staged buffers. Another task "B", which 
> corresponds to one of the staged buffers, then fails unexpectedly. When task 
> A tries to reset autoread to true, it cannot, because task B has failed.
> I have fixed this problem in our application by adding a boolean parameter 
> to the decodeBufferOrEvent method to distinguish whether the method is 
> invoked by the netty IO thread's channel read or by the staged message 
> handler task in PartitionRequestClientHandler.
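
The scenario in the issue description can be illustrated with a small toy model. This is a sketch under stated assumptions, not Flink or Netty code: `AutoReadModel`, `stage`, `decode`, `processStagedBuffers`, and `useStagedFlag` are hypothetical names, buffers are represented only by the name of the task that should receive them, and the flag handling follows the `return isStagedBuffer` suggestion from the review comment.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Toy model of the reported scenario; all names are hypothetical.
final class AutoReadModel {
    boolean autoRead = true;
    // Each staged buffer is represented by the name of its receiving task.
    final Queue<String> stagedBuffers = new ArrayDeque<>();
    final Set<String> failedTasks = new HashSet<>();
    // false models the reported behaviour, true models the described fix.
    final boolean useStagedFlag;

    AutoReadModel(boolean useStagedFlag) {
        this.useStagedFlag = useStagedFlag;
    }

    // Returns true if the caller may continue processing.
    boolean decode(String receiverTask, boolean isStagedBuffer) {
        if (failedTasks.contains(receiverTask)) {
            // The receiver is gone. Without the flag this always reports "stop";
            // with the flag, a staged buffer for a failed task does not block.
            return useStagedFlag && isStagedBuffer;
        }
        return true;
    }

    // A task without free segments stages the buffer and pauses reading.
    void stage(String receiverTask) {
        stagedBuffers.add(receiverTask);
        autoRead = false;
    }

    // Drain the staged buffers, then re-enable autoread if nothing stopped us.
    void processStagedBuffers() {
        while (!stagedBuffers.isEmpty()) {
            String receiver = stagedBuffers.poll();
            if (!decode(receiver, true)) {
                return; // autoread stays false
            }
        }
        autoRead = true;
    }

    // Task A stages buffers for A and B, B fails, then staged buffers are drained.
    static boolean run(boolean useStagedFlag) {
        AutoReadModel model = new AutoReadModel(useStagedFlag);
        model.stage("A");
        model.stage("B");
        model.failedTasks.add("B");
        model.processStagedBuffers();
        return model.autoRead;
    }

    public static void main(String[] args) {
        System.out.println("without flag: autoRead=" + run(false));
        System.out.println("with flag:    autoRead=" + run(true));
    }
}
```

In this model, the run without the flag leaves `autoRead` at false after task B fails, while the run with the flag drains the queue and restores it. A call from the IO-thread path, `decode(task, false)`, still returns false for a failed task in both variants.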



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
