You can do the following: (1) check for errors in the controller log and the state-change log, and (2) use the per-partition offset lag JMX metric on the follower to see whether the follower is making good progress.
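
For step (1), a rough sketch of scanning the logs for errors might look like the following. It assumes the brokers use a log4j layout of the form "[timestamp] LEVEL message (logger)"; if your log4j.properties differs, the regular expression needs adjusting. The function name and the sample lines are made up for illustration and are not taken from the cluster in question.

```python
import re

# Assumption: each log record starts with "[timestamp] LEVEL message ...".
# Adjust this pattern if your log4j.properties uses a different layout.
LOG_LINE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z]+)\s+(?P<msg>.*)$")


def find_problems(lines, levels=("ERROR", "FATAL")):
    """Return (timestamp, level, message) for every record logged at one of `levels`.

    Lines that do not start a new record (for example stack-trace frames)
    are skipped.
    """
    hits = []
    for line in lines:
        m = LOG_LINE.match(line.rstrip("\n"))
        if m and m.group("level") in levels:
            hits.append((m.group("ts"), m.group("level"), m.group("msg")))
    return hits


# Hypothetical sample records, for illustration only.
sample = [
    "[2014-12-01 06:31:02,417] INFO hypothetical informational message (some.Logger)",
    "[2014-12-01 06:31:05,901] ERROR hypothetical failure message (some.Logger)",
    "    at some.stack.Frame(Frame.scala:1)",
]

for ts, level, msg in find_problems(sample):
    print(ts, level, msg)
```

To run it against real files, pass an open file object instead of the sample list, e.g. find_problems(open(path)) where path points at the controller or state-change log on the broker. Passing levels=("WARN", "ERROR", "FATAL") widens the filter.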

Thanks,

Jun

On Tue, Dec 2, 2014 at 3:13 PM, Karol Nowak <gryw...@gmail.com> wrote:

> I don't have it reproduced in a sandbox environment, but it's already
> happened twice on that cluster, so it's a safe bet to say it's reproducible
> in that setup. Are there special metrics / events that I should capture to
> make debugging this easier?
>
> Thanks,
> Karol
>
> On Tue, Dec 2, 2014 at 11:20 PM, Jun Rao <jun...@gmail.com> wrote:
>
> > Is there an easy way to reproduce the issues that you saw?
> >
> > Thanks,
> >
> > Jun
> >
> > On Mon, Dec 1, 2014 at 6:31 AM, Karol Nowak <gryw...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I observed some error messages / exceptions while running partition
> > > reassignment on a Kafka 0.8.1.1 cluster. Being fairly new to this
> > > system, I'm not sure if these indicate serious failures or transient
> > > problems, or if manual intervention is needed.
> > >
> > > I used kafka-reassign-partitions.sh to reassign partitions from brokers
> > > {143,155,155,93} to {143,155,115,68} on a healthy (?) cluster. Right
> > > now one partition has just two replicas in the ISR, and a number of
> > > partitions are left with 4 replicas in the ISR even though the
> > > replication factor is 3. Logs show a few ZooKeeper timeouts, but there
> > > were no GC pauses anywhere near the session timeout. ZooKeeper itself
> > > seems healthy and not overloaded, with the exception of regular CPU
> > > spikes, probably related to snapshots.
> > >
> > > I cleaned the log lines a little bit for brevity.
> > >
> > > First example: https://gist.github.com/knowak/a682afc1545fdeb836a1
> > > Second one with two similar stack traces:
> > > https://gist.github.com/knowak/6398be433d869d8141e5
> > > Third one, many many of these:
> > > https://gist.github.com/knowak/e78301259b74841702ae
> > > Fourth: https://gist.github.com/knowak/1fbde5ca90d8f1924141
> > > Fifth: https://gist.github.com/knowak/57fdcb75b3dc7c626893
> > >
> > > Hints?
> > >
> > > Thanks,
> > > Karol
>
> --
> Regards,
> Karol Nowak
> http://knowak.wordpress.com