Hi,

I observed some error messages and exceptions while running a partition reassignment on a Kafka 0.8.1.1 cluster. Being fairly new to this system, I'm not sure whether these indicate serious failures or transient problems, or whether manual intervention is needed.
I used kafka-reassign-partitions.sh to reassign partitions from brokers {143,155,155,93} to {143,155,115,68} on a healthy (?) cluster. Right now one partition has just two replicas in the ISR, and a number of partitions are left with 4 replicas in the ISR even though the replication factor is 3.

The logs show a few ZooKeeper timeouts, but there were no GC pauses anywhere near the session timeout. ZooKeeper itself seems healthy and not overloaded, with the exception of regular CPU spikes, which are probably related to snapshots.

I cleaned up the log lines a little for brevity.

First example: https://gist.github.com/knowak/a682afc1545fdeb836a1
Second, with two similar stack traces: https://gist.github.com/knowak/6398be433d869d8141e5
Third, many many of these: https://gist.github.com/knowak/e78301259b74841702ae
Fourth: https://gist.github.com/knowak/1fbde5ca90d8f1924141
Fifth: https://gist.github.com/knowak/57fdcb75b3dc7c626893

Any hints?

Thanks,
Karol