[ 
https://issues.apache.org/jira/browse/KAFKA-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15116072#comment-15116072
 ] 

Robert Joseph Evans commented on KAFKA-3148:
--------------------------------------------

I created some code that reproduces the issue at 
https://github.com/revans2/kafka/tree/KAFKA-3148

https://github.com/revans2/kafka/blob/KAFKA-3148/clients/src/test/java/org/apache/kafka/common/network/EchoServer.java#L85-L89
 is the sleep that makes the hang happen every time.  It may need to be 
longer, up to 4 seconds, to guarantee the hang on slower boxes.

https://github.com/revans2/kafka/blob/KAFKA-3148/clients/src/test/java/org/apache/kafka/common/network/SslSelectorTest.java#L118-L119

adds a 60-second timeout to the test so that it fails instead of hanging.

https://github.com/revans2/kafka/blob/KAFKA-3148/clients/src/test/java/org/apache/kafka/common/network/SslSelectorTest.java#L112-L117

is an ugly hack showing that once you are in this situation (here, once the 
test has run for 15 seconds), sending another message gets things going again.

Typically, the client and server intermix sends and receives until the 
renegotiation happens, and the client finishes sending the 500 messages 
before the echo server wakes up from the sleep.  The server then renegotiates 
and finishes echoing back the rest of the 500 messages without the client 
waking up to read anything.  A few seconds later the 15-second timeout fires, 
the extra send goes out, and all of the buffered messages are read.
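
The sequence above can be sketched as a toy model.  This is only an 
illustration of the observed behavior, not the Kafka code: ToyClient, 
echoArrived, send, poll and run are hypothetical names, the "wakeup" flag is 
an assumed stand-in for whatever readiness signal the real client is missing, 
and the stall threshold is counted in polls rather than seconds.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StalledReadSketch {
    /** Toy client (hypothetical): buffered echoes are only read after a wakeup. */
    static final class ToyClient {
        private final Deque<String> buffered = new ArrayDeque<>(); // echoed data not yet read
        private boolean wakeupPending = false; // assumed stand-in for a readiness signal
        int received = 0;

        // The server echoed a message. signal=false models the stalled state:
        // the data is there, but nothing wakes the client up to read it.
        void echoArrived(String msg, boolean signal) {
            buffered.add(msg);
            if (signal) wakeupPending = true;
        }

        // Assumption taken from the observation: a send wakes the client up.
        void send(String msg) {
            wakeupPending = true;
        }

        // Drains the buffered data only if something woke the client up.
        int poll() {
            if (!wakeupPending) return 0;
            wakeupPending = false;
            int n = buffered.size();
            buffered.clear();
            received += n;
            return n;
        }
    }

    // Echoes arrive without a wakeup; optionally send one extra message after
    // stallThreshold idle polls. Returns how many messages were finally read.
    static int run(int messages, boolean kickWhenStalled, int stallThreshold, int maxPolls) {
        ToyClient client = new ToyClient();
        for (int i = 0; i < messages; i++) client.echoArrived("m" + i, false);
        int idle = 0;
        for (int p = 0; p < maxPolls && client.received < messages; p++) {
            idle = (client.poll() == 0) ? idle + 1 : 0;
            if (kickWhenStalled && idle >= stallThreshold) {
                client.send("kick"); // the extra send from the hack
                idle = 0;
            }
        }
        return client.received;
    }

    public static void main(String[] args) {
        System.out.println(run(500, false, 15, 60)); // prints 0: the client never wakes up
        System.out.println(run(500, true, 15, 60));  // prints 500: the extra send unblocks it
    }
}
```

Without the extra send the loop runs out of polls with nothing read, which 
corresponds to the test hitting its timeout; with it, all buffered messages 
are drained on the next poll.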

> SslSelectorTest.testRenegotiation can hang on slow boxes
> --------------------------------------------------------
>
>                 Key: KAFKA-3148
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3148
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 0.9.0.0
>            Reporter: Robert Joseph Evans
>
> The SslSelectorTest hangs very frequently (about 75% of the time) for me when 
> running on a very slow Linux Virtual Machine, but I can artificially 
> reproduce the issue on a fast box by inserting a sleep right before the 
> renegotiate happens in the EchoServer.
> It appears that after the renegotiate happens the client will not be woken up 
> to receive anything unless a send is first done.



