Hello folks,

We currently have a cluster of 20 nodes running Kafka 0.8.1.1, serving a large volume of production traffic.
I'm bringing up another 20 nodes running Kafka 0.8.2.0 in the same cluster. They currently serve a test topic with 32 partitions spread across the 20 new nodes, with replication factor 2. Every partition had a leader and an in-sync replica.

I recently restarted the new nodes. After the restart, all partitions of the test topic still have leaders, but only 2 of the 32 partitions have an in-sync replica besides the leader. I enabled debug logging on one of the brokers that wasn't syncing, but there doesn't seem to be anything useful in the log; the broker doesn't seem to be fetching from the leader at all. I stopped it again, deleted its data directory completely, and restarted it, but still no luck.

Has anybody seen this before? Is there anything in the logs that I can provide to help debug further?

Regards,
Albert