Re: Failed partition reassignment

Karol Nowak Tue, 02 Dec 2014 15:15:07 -0800

I haven't reproduced it in a sandbox environment, but it has already
happened twice on that cluster, so it's a safe bet that it's reproducible
in that setup. Are there specific metrics or events that I should capture
to make debugging this easier?



Thanks,
Karol

On Tue, Dec 2, 2014 at 11:20 PM, Jun Rao <jun...@gmail.com> wrote:

> Is there an easy way to reproduce the issues that you saw?
>
> Thanks,
>
> Jun
>
> On Mon, Dec 1, 2014 at 6:31 AM, Karol Nowak <gryw...@gmail.com> wrote:
>
> > Hi,
> >
> > I observed some error messages / exceptions while running partition
> > reassignment on a Kafka 0.8.1.1 cluster. Being fairly new to this system,
> > I'm not sure if these indicate serious failures or transient problems, or
> > if manual intervention is needed.
> >
> > I used kafka-reassign-partitions.sh to reassign partitions from brokers
> > {143,155,155,93} to {143,155,115,68} on a healthy (?) cluster. Right now
> > one partition has just two replicas in the ISR, and a number of partitions
> > are left with 4 replicas in the ISR even though the replication factor is
> > 3. Logs show a few ZooKeeper timeouts, but there were no GC pauses anywhere
> > near the session timeout. ZooKeeper itself seems healthy and not
> > overloaded, with the exception of regular CPU spikes, probably related to
> > snapshots.
> >
> > I cleaned up the log lines a little for brevity.
> >
> > First example: https://gist.github.com/knowak/a682afc1545fdeb836a1
> > Second one with two similar stack traces:
> > https://gist.github.com/knowak/6398be433d869d8141e5
> > Third one, many many of these:
> > https://gist.github.com/knowak/e78301259b74841702ae
> > Fourth: https://gist.github.com/knowak/1fbde5ca90d8f1924141
> > Fifth: https://gist.github.com/knowak/57fdcb75b3dc7c626893
> >
> > Hints?
> >
> >
> > Thanks,
> > Karol
> >
>
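
The ISR symptom described in the quoted report (partitions whose ISR holds 2
or 4 replicas while the replication factor is 3) can be checked mechanically.
Below is a minimal sketch in Python, not something from the thread itself. It
assumes the output of `kafka-topics.sh --describe` has been captured as text
and that each partition line has the shape shown in the comment; the function
name `find_isr_mismatches`, the topic name `events`, and the sample rows are
hypothetical and only reuse broker ids mentioned above.

```python
import re

# ASSUMPTION: partition lines look roughly like this (tab- or space-separated):
#   Topic: events  Partition: 1  Leader: 155  Replicas: 155,115,68  Isr: 155,115
# Adjust the pattern if your Kafka version prints a different layout.
PARTITION_LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]*)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def parse_ids(csv):
    """Turn '143,155,115' into [143, 155, 115]; an empty string gives []."""
    return [int(x) for x in csv.split(",") if x]


def find_isr_mismatches(describe_output, expected_rf):
    """Return (topic, partition, replicas, isr) for every partition whose
    ISR size differs from the expected replication factor."""
    mismatches = []
    for line in describe_output.splitlines():
        match = PARTITION_LINE.search(line)
        if match is None:
            continue  # topic summary line or unrelated output
        isr = parse_ids(match.group("isr"))
        if len(isr) != expected_rf:
            mismatches.append((
                match.group("topic"),
                int(match.group("partition")),
                parse_ids(match.group("replicas")),
                isr,
            ))
    return mismatches


# Hypothetical sample data, made up for illustration only.
SAMPLE = "\n".join([
    "Topic:events\tPartitionCount:3\tReplicationFactor:3\tConfigs:",
    "\tTopic: events\tPartition: 0\tLeader: 143\tReplicas: 143,155,115\tIsr: 143,155,115",
    "\tTopic: events\tPartition: 1\tLeader: 155\tReplicas: 155,115,68\tIsr: 155,115",
    "\tTopic: events\tPartition: 2\tLeader: 143\tReplicas: 143,155,115,93\tIsr: 143,155,115,93",
])

if __name__ == "__main__":
    for topic, partition, replicas, isr in find_isr_mismatches(SAMPLE, expected_rf=3):
        print(f"{topic}-{partition}: replicas={replicas} isr={isr}")
```

Comparing against an explicit `expected_rf` rather than the length of the
Replicas column is deliberate: while a reassignment is in flight (or stuck),
the replica list itself may temporarily contain both old and new brokers.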



-- 
Regards,
Karol Nowak
http://knowak.wordpress.com
