[ https://issues.apache.org/jira/browse/KAFKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15035665#comment-15035665 ]
The Data Lorax commented on KAFKA-1894:
---------------------------------------

I'm running into this issue and struggling to find a way around it - if the Kafka cluster is unavailable the KafkaConsumer.poll() call can block indefinitely - and does not even enter an interruptible state, which means there is no way of recovering, short of thread.stop().

Would be good to move this into a more imminent release or at least have the thread enter an interruptible state within the loop.

> Avoid long or infinite blocking in the consumer
> -----------------------------------------------
>
>                 Key: KAFKA-1894
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1894
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: consumer
>            Reporter: Jay Kreps
>            Assignee: Jason Gustafson
>             Fix For: 0.10.0.0
>
>
> The new consumer has a lot of loops that look something like
> {code}
> while(!isThingComplete())
>   client.poll();
> {code}
> This occurs both in KafkaConsumer but also in NetworkClient.completeAll.
> These retry loops are actually mostly the behavior we want but there are
> several cases where they may cause problems:
> - In the case of a hard failure we may hang for a long time or indefinitely
>   before realizing the connection is lost.
> - In the case where the cluster is malfunctioning or down we may retry
>   forever.
> It would probably be better to give a timeout to these. The proposed approach
> would be to add something like retry.time.ms=60000 and only continue retrying
> for that period of time.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
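
For illustration only, the retry loop quoted in the description could be bounded along the lines proposed there, with the interruption check requested in the comment added to the loop. This is a minimal sketch and not Kafka code: the class `BoundedRetry`, the method `retryUntil`, and the `Outcome` enum are hypothetical names, and `retryTimeMs` merely stands in for the proposed `retry.time.ms` setting. It also assumes that each poll call honours the timeout it is handed.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;
import java.util.function.LongConsumer;

// Hypothetical helper, not part of Kafka: a retry loop that gives up after a
// fixed time budget and that notices thread interruption on every iteration.
final class BoundedRetry {

    enum Outcome { COMPLETED, TIMED_OUT, INTERRUPTED }

    /**
     * Calls pollOnce repeatedly until isThingComplete reports true, the
     * retryTimeMs budget is used up, or the current thread is interrupted.
     * pollOnce receives the remaining budget in milliseconds and is assumed
     * (an assumption of this sketch) to block for no longer than that.
     */
    static Outcome retryUntil(BooleanSupplier isThingComplete,
                              LongConsumer pollOnce,
                              long retryTimeMs) {
        final long deadlineNanos =
                System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(retryTimeMs);

        while (!isThingComplete.getAsBoolean()) {
            // Leave the interrupt flag set so the caller can still observe it.
            if (Thread.currentThread().isInterrupted()) {
                return Outcome.INTERRUPTED;
            }
            long remainingNanos = deadlineNanos - System.nanoTime();
            if (remainingNanos <= 0) {
                return Outcome.TIMED_OUT;
            }
            // Round sub-millisecond remainders up to 1 ms so the poll never
            // receives a zero timeout while budget is still left.
            long remainingMs =
                    Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
            pollOnce.accept(remainingMs);
        }
        return Outcome.COMPLETED;
    }
}
```

A caller would pass the completion check and a bounded poll as lambdas, for example `BoundedRetry.retryUntil(() -> done.get(), ms -> client.poll(ms), 60000L)`, where `done` and `client` are placeholders. Returning an outcome instead of throwing lets the caller decide whether a timeout is an error.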