Re: Question on hello-samza (Kafka startup and shutdown)

Chris Riccomini Fri, 20 Feb 2015 08:55:24 -0800

Hey Chinmay,

It seems controlled.shutdown.enable=true is the default. Chinmay, did you
figure this out? I haven't seen this before, but I don't usually stop/start
within 5s of each other.


One thing that you might have a look at is whether the Kafka or ZK
processes are living past bin/grid stop all. I have seen procs (NM and
Kafka usually) continue to be alive after `stop all` is executed. I think
this is because the stop scripts send SIGTERM and return immediately. This
allows procs to do a cleaner shutdown. But if you stop/start quickly, you
might get some weirdness there. Try jps'ing in between the stop/start, and
check to make sure there's nothing still alive (wait in a loop until
everything shuts down cleanly, and kill -9 if it takes more than 60s, or
something).
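
A minimal sketch of that wait loop, in case it helps. The process names are
an assumption about what jps prints for the hello-samza grid (Kafka,
QuorumPeerMain for ZK, NodeManager and ResourceManager for YARN), and the
function names are made up for illustration, so adjust both to your setup.

```python
import os
import signal
import subprocess
import time

# Assumed jps main-class names for the grid processes; check your own jps output.
GRID_PROCS = {"Kafka", "QuorumPeerMain", "NodeManager", "ResourceManager"}


def parse_jps(output, names=GRID_PROCS):
    """Return {pid: name} for jps lines whose class name is in `names`."""
    alive = {}
    for line in output.splitlines():
        parts = line.split()
        # jps prints "<pid> <main class>"; skip lines without a class name.
        if len(parts) == 2 and parts[1] in names:
            alive[int(parts[0])] = parts[1]
    return alive


def list_grid_procs():
    """Run jps and return the grid processes that are still alive."""
    out = subprocess.run(["jps"], capture_output=True, text=True, check=True)
    return parse_jps(out.stdout)


def wait_for_shutdown(list_procs=list_grid_procs, timeout=60.0, poll=1.0,
                      kill=os.kill, sleep=time.sleep, clock=time.monotonic):
    """Poll until no grid process is left; SIGKILL stragglers after `timeout`.

    Returns the sorted list of pids that had to be force-killed.
    """
    deadline = clock() + timeout
    while True:
        alive = list_procs()
        if not alive:
            return []
        if clock() >= deadline:
            for pid in alive:
                try:
                    kill(pid, signal.SIGKILL)  # same as kill -9
                except ProcessLookupError:
                    pass  # exited between the jps call and the kill
            return sorted(alive)
        sleep(poll)
```

You would call wait_for_shutdown() after bin/grid stop all and before
bin/grid start all. An empty return value means everything exited on its own.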

Cheers,
Chris

On Thu, Feb 19, 2015 at 2:01 PM, Neha Narkhede <[email protected]>
wrote:

> Depending on the version of Kafka you're at, "controlled.shutdown.enable"
> should be set to true. If that's true and you always shut down the broker
> cleanly (kill -15, not kill -9) and there is more than one replica
> available, you should not see LeaderNotAvailable exceptions. If you kill
> the broker (kill -9) then Kafka does not get a chance to move the leaders
> away from the broker being shut down and the leader re-election can take
> some time, leading to many LeaderNotAvailable exceptions.
>
> You can verify the replica availability as well as leader movement through
> the kafka-topics command before shutting down zookeeper.
>
> Thanks
> Neha
>
> On Thu, Feb 19, 2015 at 10:51 AM, Felix GV <[email protected]>
> wrote:
>
> > I'm not 100% sure, but I think this happens when ZK ephemeral znodes have
> > not had time to expire properly. When Kafka shuts down gracefully, it
> > should clean up its ephemeral nodes immediately (presumably, but that is
> > also an assumption... maybe it does have a short-coming in its graceful
> > shutdown logic). If Kafka gets killed improperly and bounced back up right
> > away, it cannot assume leadership properly because the ephemeral znodes of
> > the previous run are still there in ZK.
> >
> > I imagine Kafka could have some logic to deal with that better when it
> > gets fast-bounced... Alternatively, you may just have to wait a bit before
> > restarting Kafka after killing it.
> >
> > If anyone knows better, please correct me if I'm wrong.
> >
> > --
> >
> > Felix GV
> > Data Infrastructure Engineer
> > Distributed Data Systems
> > LinkedIn
> >
> > [email protected]
> > linkedin.com/in/felixgv
> >
> > ________________________________________
> > From: Chinmay Soman [[email protected]]
> > Sent: Thursday, February 19, 2015 10:44 AM
> > To: [email protected]
> > Subject: Question on hello-samza (Kafka startup and shutdown)
> >
> > Sending to a wider audience to know if anyone is also seeing this issue.
> >
> > It seems Kafka gets in a weird state every time I do bin/grid stop all (and
> > then start all).
> >
> > I keep getting a LeaderNotAvailable exception on the producer side. It
> > seems this happens every time Kafka hasn't been shut down properly. This
> > issue goes away if I use the following sequence:
> >
> > * bin/grid stop kafka
> > * bin/grid stop zookeeper (after like 5 seconds).
> >
> > (and then start everything).
> >
> > Has anyone else seen this?
> >
> > --
> > Thanks and regards
> >
> > Chinmay Soman
> >
>
