Hi 刘建刚,

Could you explain how you fixed the problem in your case? Did you modify 
the Flink code to use `IdleStateHandler`?

Piotrek

> On 13 Feb 2020, at 11:10, 刘建刚 <liujiangangp...@gmail.com> wrote:
> 
> Thanks for all the help. Following the advice, I have fixed the problem.
> 
>> On 13 Feb 2020, at 6:05 PM, Zhijiang <wangzhijiang...@aliyun.com 
>> <mailto:wangzhijiang...@aliyun.com>> wrote:
>> 
>> Thanks for reporting this issue. I agree with the analysis below. 
>> We actually ran into the same issue several years ago and also solved it 
>> with the Netty idle handler.
>> 
>> As the next step, let's track it in ticket [1].
>> 
>> [1] https://issues.apache.org/jira/browse/FLINK-16030 
>> 
>> Best,
>> Zhijiang
>> 
>> ------------------------------------------------------------------
>> From:张光辉 <beggingh...@gmail.com <mailto:beggingh...@gmail.com>>
>> Send Time:2020 Feb. 12 (Wed.) 22:19
>> To:Benchao Li <libenc...@gmail.com <mailto:libenc...@gmail.com>>
>> Cc:刘建刚 <liujiangangp...@gmail.com <mailto:liujiangangp...@gmail.com>>; user 
>> <user@flink.apache.org <mailto:user@flink.apache.org>>
>> Subject:Re: Encountered error while consuming partitions
>> 
>> Networks can fail in many ways, some of them quite subtle (e.g. a high 
>> packet loss ratio). 
>> 
>> The problem is that the long-lived TCP connection between the Netty client 
>> and server was lost. The server then failed to send a message to the client 
>> and shut down the channel. The Netty client does not know that the 
>> connection has been dropped, so it keeps waiting. 
>> 
>> There are two ways to detect whether a long-lived TCP connection between 
>> the Netty client and server is still alive: TCP keepalive and heartbeats.
>> The TCP keepalive time is 2 hours by default. When the error occurs, if you 
>> keep waiting for 2 hours, the Netty client will raise an exception and 
>> enter failover recovery.
>> If you want to detect a dead connection sooner, Netty provides 
>> IdleStateHandler, which can be used to implement a ping-pong mechanism: if 
>> the Netty client sends n consecutive ping messages and receives no pong 
>> message in return, it raises an exception.
>> 
> 
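
For reference, the "n pings without a pong" rule from the quoted message could be sketched roughly as below. This is only an illustration using the plain JDK, not the change tracked in FLINK-16030: `HeartbeatMonitor`, `onIdle` and `onPongReceived` are hypothetical names, and I am assuming `onIdle` would be called from wherever the idle event is handled (with Netty, typically a handler's `userEventTriggered` after `IdleStateHandler` fires an `IdleStateEvent`).

```java
/**
 * Hypothetical helper, not part of Flink or Netty: tracks how many pings
 * were sent in a row without any pong coming back.
 * Assumed to be used from a single thread (e.g. one channel's event loop).
 */
final class HeartbeatMonitor {
    private final int maxUnansweredPings;
    private int unansweredPings;

    HeartbeatMonitor(int maxUnansweredPings) {
        if (maxUnansweredPings < 1) {
            throw new IllegalArgumentException("maxUnansweredPings must be >= 1");
        }
        this.maxUnansweredPings = maxUnansweredPings;
    }

    /**
     * Call when the idle timer fires.
     * Returns true if another ping should be sent, or false if the limit of
     * unanswered pings is already reached and the connection should be
     * treated as lost.
     */
    boolean onIdle() {
        if (unansweredPings >= maxUnansweredPings) {
            return false;
        }
        unansweredPings++;
        return true;
    }

    /** Call when a pong arrives; the peer is alive, so reset the counter. */
    void onPongReceived() {
        unansweredPings = 0;
    }

    public static void main(String[] args) {
        // Simulate a peer that never answers, with a limit of 3 pings.
        HeartbeatMonitor monitor = new HeartbeatMonitor(3);
        for (int tick = 1; tick <= 4; tick++) {
            String action = monitor.onIdle() ? "send ping" : "connection lost";
            System.out.println("idle tick " + tick + ": " + action);
        }
    }
}
```

With an idle interval of t seconds and a limit of n, a silent peer would be noticed after roughly (n + 1) * t seconds, rather than after the 2-hour TCP keepalive default mentioned above.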
