I'm not 100% sure, but I think this happens when ZK ephemeral znodes have not
had time to expire. When Kafka shuts down gracefully, it should clean up its
ephemeral nodes immediately (presumably, though that is also an assumption;
maybe its graceful shutdown logic has a shortcoming). If Kafka gets killed
abruptly and bounced back up right away, it cannot assume leadership properly
because the ephemeral znodes from the previous run are still there in ZK
until the old ZK session times out.

I imagine Kafka could have some logic to deal with that better when it gets 
fast-bounced... Alternatively, you may just have to wait a bit before 
restarting Kafka after killing it.
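
To make the "wait a bit" option less of a guess, here is a rough Python sketch
that polls until the old broker registration is gone instead of sleeping for a
fixed time. It assumes a ZooKeeper-based Kafka where each broker registers an
ephemeral znode under /brokers/ids. The function name, its parameters, and the
list_broker_ids callable are all made up for illustration; you would have to
supply a callable that lists the children of /brokers/ids with whatever ZK
client you use. I have not run this against hello-samza.

```python
import time
from typing import Callable, Iterable


def wait_for_broker_znode_to_expire(
    list_broker_ids: Callable[[], Iterable[str]],
    broker_id: str,
    timeout_s: float = 30.0,
    poll_interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once broker_id is no longer registered, False on timeout.

    list_broker_ids is a hypothetical hook: it should return the current
    children of /brokers/ids (assumption: ZooKeeper-based Kafka).
    """
    deadline = clock() + timeout_s
    while True:
        # The stale ephemeral znode disappears once the old session ends.
        if broker_id not in set(list_broker_ids()):
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

The idea would be to call this after killing Kafka and only start the broker
again once it returns True. The timeout should be longer than the ZK session
timeout your broker is configured with.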

If anyone knows better, please correct me.

--

Felix GV
Data Infrastructure Engineer
Distributed Data Systems
LinkedIn

f...@linkedin.com
linkedin.com/in/felixgv

________________________________________
From: Chinmay Soman [chinmay.cere...@gmail.com]
Sent: Thursday, February 19, 2015 10:44 AM
To: dev@samza.apache.org
Subject: Question on hello-samza (Kafka startup and shutdown)

Sending to a wider audience to know if anyone is also seeing this issue.

It seems Kafka gets into a weird state every time I do bin/grid stop all (and
then start all).

I keep getting a LeaderNotAvailable exception on the producer side. It
seems this happens every time Kafka hasn't been shut down properly. This
issue goes away if I use the following sequence:

* bin/grid stop kafka
* bin/grid stop zookeeper (after about 5 seconds)

(and then start everything).

Has anyone else seen this?

--
Thanks and regards

Chinmay Soman
