Hi,

I observed some error messages and exceptions while running a partition
reassignment on a Kafka 0.8.1.1 cluster. Being fairly new to this system,
I'm not sure whether these indicate serious failures or transient problems,
or whether manual intervention is needed.

I used kafka-reassign-partitions.sh to reassign partitions from brokers
{143,155,155,93} to {143,155,115,68} on a healthy (?) cluster. Right now
one partition has just two replicas in the ISR, and a number of partitions
are left with 4 replicas in the ISR even though the replication factor is
3. The logs show a few ZooKeeper timeouts, but there were no GC pauses
anywhere near the session timeout. ZooKeeper itself seems healthy and not
overloaded, with the exception of regular CPU spikes, probably related to
snapshots.
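
For reference, here is a minimal sketch of how partitions in this state could
be spotted. It assumes the text printed by `kafka-topics.sh --describe` has
per-partition lines of the form `Topic: t  Partition: 0  Leader: 1
Replicas: 1,2,3  Isr: 1,2,3`; the function names and the `expected_rf`
parameter are made up for illustration and are not part of any Kafka tool.

```python
import re

# Matches "Key: value" pairs such as "Partition: 0" or "Isr: 143,155,115".
# ASSUMPTION: this mirrors the per-partition lines of `kafka-topics.sh --describe`.
FIELD_RE = re.compile(r"(\w+):\s*(\S+)")


def parse_partition_line(line):
    """Return (topic, partition, isr_broker_ids) or None for non-partition lines."""
    fields = dict(FIELD_RE.findall(line))
    # The per-topic summary line has "PartitionCount", not "Partition"; skip it.
    if "Partition" not in fields or "Topic" not in fields:
        return None
    isr_raw = fields.get("Isr", "")
    isr = [int(b) for b in isr_raw.split(",") if b]
    return fields["Topic"], int(fields["Partition"]), isr


def isr_mismatches(describe_output, expected_rf):
    """List partitions whose ISR size differs from the expected replication factor."""
    mismatches = []
    for line in describe_output.splitlines():
        parsed = parse_partition_line(line)
        if parsed is None:
            continue
        topic, partition, isr = parsed
        if len(isr) != expected_rf:
            mismatches.append((topic, partition, isr))
    return mismatches
```

Feeding it the saved describe output with `expected_rf=3` would list both the
under-replicated partition and the ones still showing 4 replicas in the ISR.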

I trimmed the log lines a little for brevity.

First example: https://gist.github.com/knowak/a682afc1545fdeb836a1
Second one with two similar stack traces:
https://gist.github.com/knowak/6398be433d869d8141e5
Third one, many many of these:
https://gist.github.com/knowak/e78301259b74841702ae
Fourth: https://gist.github.com/knowak/1fbde5ca90d8f1924141
Fifth: https://gist.github.com/knowak/57fdcb75b3dc7c626893

Hints?


Thanks,
Karol
