[jira] [Commented] (KAFKA-1286) Retry Can Block

Jay Kreps (JIRA) Wed, 05 Mar 2014 13:33:12 -0800

    [ 
https://issues.apache.org/jira/browse/KAFKA-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13921443#comment-13921443
 ]


Jay Kreps commented on KAFKA-1286:
----------------------------------

Jun:
1. They are equivalent for booleans but I wouldn't be opposed to changing it.
2. Done.

Neha:
I don't think so, right? The current logic is:
  boolean backingOff = batch.attempts > 0 && batch.lastAttempt + retryBackoffMs > now;
Basically that is saying "we are backing off if we have made at least one 
attempt and the backoff period has not yet expired".

But I may be confused...so many booleans!
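
To make the condition easier to check by hand, here is a minimal sketch that wraps the same expression in a method. The Batch and BackoffCheck classes are hypothetical stand-ins written only for this illustration, not the producer's actual classes, and the timestamps and the 100 ms backoff are made-up values.

```java
// Hypothetical stand-in for the producer's record batch; only the two
// fields used by the quoted expression are modeled here.
final class Batch {
    final int attempts;      // number of send attempts made so far
    final long lastAttempt;  // timestamp (ms) of the most recent attempt

    Batch(int attempts, long lastAttempt) {
        this.attempts = attempts;
        this.lastAttempt = lastAttempt;
    }
}

final class BackoffCheck {
    // Same expression as quoted above, wrapped in a method for testing.
    static boolean isBackingOff(Batch batch, long retryBackoffMs, long now) {
        return batch.attempts > 0 && batch.lastAttempt + retryBackoffMs > now;
    }

    public static void main(String[] args) {
        long retryBackoffMs = 100L; // assumed value, for illustration only

        // Never attempted: not backing off, whatever the timestamps are.
        System.out.println(isBackingOff(new Batch(0, 1000L), retryBackoffMs, 1050L)); // false

        // One attempt at t=1000; backoff ends at t=1100, so t=1050 is inside it.
        System.out.println(isBackingOff(new Batch(1, 1000L), retryBackoffMs, 1050L)); // true

        // At t=1100 the backoff period has ended (the comparison is strict).
        System.out.println(isBackingOff(new Batch(1, 1000L), retryBackoffMs, 1100L)); // false
    }
}
```

Because the comparison is strict, a batch becomes eligible again at exactly lastAttempt + retryBackoffMs.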

> Retry Can Block 
> ----------------
>
>                 Key: KAFKA-1286
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1286
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: producer 
>            Reporter: Guozhang Wang
>         Attachments: KAFKA-1286.patch, KAFKA-1286_2014-03-04_11:04:32.patch, 
> KAFKA-1286_2014-03-04_15:14:49.patch, KAFKA-1286_2014-03-04_17:56:47.patch
>
>
> Under the following scenario the retry logic can block:
> 1. The last broker's socket is closed, sender.handleDisconnect() is triggered, 
> and the node is marked as disconnected.
> 2. In the next sender.run(), since the node is disconnected, the partition is 
> removed from the ready set and sender.initConnection() is called, which does 
> not throw an exception.
> 3. So in this round of sends, the only request the sender tries to send is 
> the metadata request, to the last broker, and the sender first tries to 
> connect to that broker.
> 4. In selector.poll(), the finishConnect() call throws an exception, and in 
> handleDisconnects(), the inFlight request's batches are null since it is a 
> metadata request.
> 5. Now we go back to step 1 and loop forever. Note that this infinite loop 
> can be triggered even without calling producer.close.
> Also, we need to introduce the retry backoff config; otherwise the retries 
> will be exhausted too soon (in my tests 10 retries can be exhausted in about 
> 600ms).
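
As a rough illustration of why a backoff setting helps with the last point: if every retry has to wait out the backoff first, the retries cannot be used up faster than retries * backoff. The sketch below only does that arithmetic. The RetryWindow class is hypothetical, and the 100 ms backoff is an assumed value, not one taken from the issue.

```java
final class RetryWindow {
    // Lower bound (ms) on the time needed to exhaust all retries when each
    // retry must wait at least retryBackoffMs after the previous attempt.
    // Network and processing time are ignored, so real runs take longer.
    static long minExhaustionMs(int retries, long retryBackoffMs) {
        return (long) retries * retryBackoffMs;
    }

    public static void main(String[] args) {
        int retries = 10;           // retry count from the issue description
        long retryBackoffMs = 100L; // assumed backoff, for illustration only
        System.out.println(minExhaustionMs(retries, retryBackoffMs) + " ms"); // 1000 ms
    }
}
```

With these assumed numbers the 10 retries span at least 1000 ms, compared with the roughly 600 ms observed without any backoff.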



--
This message was sent by Atlassian JIRA
(v6.2#6252)
