We found our consumer stopped working after this exception occurred.
Can the consumer recover from such an exception?

Regards,

Libo


-----Original Message-----
From: Florin Trofin [mailto:ftro...@adobe.com] 
Sent: Tuesday, July 16, 2013 4:20 PM
To: users@kafka.apache.org
Subject: Re: ConsumerRebalanceFailedException

Yes, I think these are two separate issues.

F.

On 7/16/13 11:32 AM, "Joel Koshy" <jjkosh...@gmail.com> wrote:

>From a user's perspective, ConsumerRebalanceFailedException is a bit cryptic
>- I think the other thread was about providing a more informative message
>and also being able to recover when a broker does come up (fixed in
>KAFKA-969).
>
>Thanks,
>
>Joel
>
>On Tue, Jul 16, 2013 at 11:04 AM, Vaibhav Puranik <vpura...@gmail.com>
>wrote:
>> Thank you Joel.
>>
>> In a different but related thread, somebody is asking to rename the
>> exception to NoBrokerAvailableException. But given the description
>> above, the exception seems to be named appropriately.
>>
>> Regards,
>> Vaibhav
>>
>>
>> On Tue, Jul 16, 2013 at 12:05 AM, Joel Koshy <jjkosh...@gmail.com>
>>wrote:
>>
>>> Yes - rebalance => consumers trying to coordinate through ZK.
>>> Rebalances can happen when one or more of the following happen:
>>> - a consumed topic partition appears or disappears - i.e., if a 
>>> broker comes or goes.
>>> - a consumer instance in the group comes or goes. "Goes" could also
>>> be triggered by session expirations in zookeeper - typically caused
>>> by client-side GC or flaky connections to zookeeper.
>>>
>>> On Mon, Jul 15, 2013 at 10:15 AM, Vaibhav Puranik 
>>> <vpura...@gmail.com>
>>> wrote:
>>> > Hi all,
>>> >
>>> > We have a small Kafka cluster (0.7.1 - 3 nodes) in EC2. The load is
>>> > about 200 million events per day, each being a few kilobytes. We have
>>> > a single-node zookeeper.
>>> >
>>> > Yesterday suddenly our Kafka clients started throwing the following
>>> > exception:
>>> > java.lang.RuntimeException: kafka.common.ConsumerRebalanceFailedException:
>>> > CONSUMER_GROUP_NAME_ip-00-00-00-00.ec2.internal-1373821190828-5f78e9af
>>> > can't rebalance after 4 retries
>>> >     at com.gumgum.kafka.consumer.KafkaTemplate.executeWithBatch(KafkaTemplate.java:59)
>>> >     at com.gumgum.storm.fileupload.GenericKafkaSpout.nextTuple(GenericKafkaSpout.java:73)
>>> >     at backtype.storm.daemon.executor$fn__3968$fn__4009$fn__4010.invoke(executor.clj:433)
>>> >     at backtype.storm.util$async_loop$fn__465.invoke(util.clj:377)
>>> >
>>> > None of the Kafka clients (ConsumerConnector class) would start. They
>>> > would fail with the exception.
>>> >
>>> > We tried restarting the clients and restarting zookeeper as well, but
>>> > it only started working once we restarted all of our Kafka brokers.
>>> > We didn't lose any data because producers (going directly to the
>>> > brokers through a load balancer) were working fine.
>>> >
>>> > I tried googling this issue and it looks like a lot of people have
>>> > faced it, but I couldn't find anything concrete.
>>> >
>>> > Given this, I have two questions:
>>> >
>>> > It would be nice if you could tell me why this can happen or point me
>>> > to a link where I can understand it better. What does consumer
>>> > rebalancing mean? Does that mean consumers are trying to coordinate
>>> > amongst themselves using Zookeeper?
>>> >
>>> > On a separate note, are there any JMX parameters I need to be
>>> > monitoring to make sure that my Kafka cluster is healthy? How can I
>>> > keep watch on my Kafka cluster?
>>> >
>>> > Regards,
>>> > Vaibhav Puranik
>>> > GumGum
>>>
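
On the question at the top of the thread (can the consumer recover after
this exception?), one application-level option is to retry consumer startup
with a backoff instead of failing on the first error. The following is only
a minimal sketch under stated assumptions, not Kafka's own recovery
mechanism: the class name RetryWithBackoff and its parameters are
hypothetical, and the action passed in stands for whatever application code
creates the consumer. It would not have helped in the case reported above,
where only a broker restart resolved the problem.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper (not part of Kafka): retries an action with exponential backoff. */
public final class RetryWithBackoff {

    private RetryWithBackoff() {}

    /**
     * Runs the action until it succeeds or maxAttempts attempts have failed.
     * Sleeps between failed attempts, doubling the delay each time.
     * Rethrows the last failure if every attempt fails.
     */
    public static <T> T retry(Callable<T> action, int maxAttempts, long initialBackoffMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoffMs = initialBackoffMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (InterruptedException e) {
                // Do not retry if the thread was asked to stop.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMs);
                    backoffMs *= 2;
                }
            }
        }
        throw last;
    }
}
```

A caller would wrap its own consumer-creation code in the action and choose
the attempt count and initial delay to suit how long a broker or zookeeper
outage is expected to last.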
