[ 
https://issues.apache.org/jira/browse/KAFKA-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-1286:
---------------------------------

    Attachment: KAFKA-1286.patch

> Retry Can Block 
> ----------------
>
>                 Key: KAFKA-1286
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1286
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: producer 
>            Reporter: Guozhang Wang
>         Attachments: KAFKA-1286.patch
>
>
> The retry logic can block under the following scenario:
> 1. The last broker's socket is closed; sender.handleDisconnect() is triggered 
> and marks the node as disconnected.
> 2. In the next sender.run(), since the node is disconnected, the partition is 
> removed from the ready set and sender.initConnection() is called, which does 
> not throw an exception.
> 3. So in this round of sends, the only request the sender tries to send is the 
> metadata request, to the last broker, and the sender first tries to connect 
> to that broker.
> 4. In selector.poll(), the finishConnect() call throws an exception, and in 
> handleDisconnects(), the inFlight request's batches are null since it is a 
> metadata request.
> 5. We are now back at step 1, and the loop repeats forever. Note that this 
> infinite loop can be triggered even without calling producer.close.
> Also, we need to introduce a retry backoff config; otherwise the retries 
> are exhausted too soon (in my tests, 10 retries were exhausted in about 
> 600 ms).
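
To illustrate the retry backoff idea from the description, here is a minimal sketch, not the code in the attached patch. The class RetryBackoffSketch, its methods, and the 100 ms backoff value are all hypothetical names and numbers chosen for illustration; none of them come from the Kafka code base. The sketch only shows the general technique: a retry is allowed when retries remain and at least the backoff interval has passed since the last attempt.

```java
// Hypothetical sketch: names and values are illustrative assumptions,
// not taken from the Kafka producer or from KAFKA-1286.patch.
public class RetryBackoffSketch {
    private final int maxRetries;
    private final long retryBackoffMs;
    private int attempts = 0;
    private long lastAttemptMs = -1L; // -1 means no attempt has been made yet

    public RetryBackoffSketch(int maxRetries, long retryBackoffMs) {
        this.maxRetries = maxRetries;
        this.retryBackoffMs = retryBackoffMs;
    }

    /** True if retries remain and the backoff interval has elapsed. */
    public boolean canRetry(long nowMs) {
        if (attempts >= maxRetries) {
            return false;
        }
        return lastAttemptMs < 0 || nowMs - lastAttemptMs >= retryBackoffMs;
    }

    /** Record that an attempt was made at the given time. */
    public void recordAttempt(long nowMs) {
        attempts++;
        lastAttemptMs = nowMs;
    }

    public int attempts() {
        return attempts;
    }

    /** Lower bound on the time between the first and the last attempt. */
    public static long minTimeToExhaustMs(int maxRetries, long retryBackoffMs) {
        return (long) Math.max(0, maxRetries - 1) * retryBackoffMs;
    }

    public static void main(String[] args) {
        RetryBackoffSketch retries = new RetryBackoffSketch(10, 100L);
        long nowMs = 0L;
        // Poll every 10 ms as a stand-in for a sender loop iteration.
        while (retries.attempts() < 10) {
            if (retries.canRetry(nowMs)) {
                retries.recordAttempt(nowMs);
            }
            nowMs += 10L;
        }
        System.out.println("attempts made: " + retries.attempts());
        System.out.println("minimum spread (ms): " + minTimeToExhaustMs(10, 100L));
    }
}
```

With a backoff in place, the time needed to exhaust the retries is bounded from below by the backoff interval rather than by how fast the loop spins, which is the property the description asks for.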



--
This message was sent by Atlassian JIRA
(v6.2#6252)
