I started a partition reassignment (this is an 8.1.1 cluster) some time
ago and it seems to be stuck. Partitions are no longer getting moved
around, but it seems like the cluster is operational otherwise. The
stuck nodes have a lot of 00000000000000000000.index files, and their
logs show errors like:
[2015-04-21 12:15:36,585] 3237789 [ReplicaFetcherThread-0-28] ERROR kafka.server.ReplicaFetcherThread - [ReplicaFetcherThread-0-28], Error for partition [pings,227] to broker 28:class kafka.common.UnknownException
I'm at a loss as to what I should be looking at. Any ideas?
Thanks,
Wes