Hi,

Before Kafka 1.1.0, if unclean leader election is enabled and there are no in-sync replicas (ISRs), the leader is set to -1 and the ISR will be empty. During an upgrade, if you have single-replica partitions or if all replicas fall out of the ISR, you get into this situation.
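To spot partitions in that state, something like the following may help. It is only a rough sketch: the function name `leaderless_partitions` is made up for illustration, and it assumes the per-partition lines of `kafka-topics.sh --describe` use the usual "Topic: .. Partition: .. Leader: .. Replicas: .. Isr: .." layout.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kafka): reads `kafka-topics.sh --describe`
# output on stdin and prints "<topic> <partition>" for every partition whose
# leader is reported as -1.
# Assumption: per-partition lines look like
#   Topic: <name>  Partition: <n>  Leader: <id>  Replicas: <ids>  Isr: <ids>
leaderless_partitions() {
  awk '
    /Partition:/ && /Leader:/ {
      topic = ""; part = ""; leader = ""
      # Walk the whitespace-separated fields and pick the value after each label.
      for (i = 1; i < NF; i++) {
        if ($i == "Topic:")     topic  = $(i + 1)
        if ($i == "Partition:") part   = $(i + 1)
        if ($i == "Leader:")    leader = $(i + 1)
      }
      if (leader == "-1") print topic, part
    }
  '
}
```

It could be fed from the describe command, for example `kafka/bin/kafka-topics.sh --zookeeper <zk's> --describe | leaderless_partitions`, before and after restarting each broker.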
From Kafka 0.11.0.0 onward, unclean leader election is disabled by default. With this change, Kafka cannot elect a new leader when the ISR is empty and the leader is -1.

What is the replication factor? Was unclean leader election enabled (it was enabled by default in 0.10.0.1)? With a sufficient replication factor and a healthy ISR, we may not see this issue.

On Mon, Apr 23, 2018 at 12:29 PM, Enrique Medina Montenegro <e.medin...@gmail.com> wrote:

> What type of storage do you have for your setup?
>
> On April 23, 2018 at 8:04:46 AM, Mika Linnanoja <mika.linnan...@rovio.com> wrote:
>
>> Hello,
>>
>> Last week I upgraded one relatively large Kafka cluster (EC2, 10 brokers,
>> ~30 TB data, 100-300 Mbps in/out per instance) from 0.10.0.1 to 1.0, and
>> saw some issues.
>>
>> Out of ~100 topics with 2..20 partitions each, 9 partitions in 8 topics
>> became "unavailable" across 3 brokers. The leader was shown as -1 and the
>> ISR was empty. A Java service using 0.10.0.1 clients was unable to send
>> any data to these partitions, so it got dropped.
>>
>> The partitions were shown in the output of
>> `kafka/bin/kafka-topics.sh --zookeeper <zk's> --unavailable-partitions --describe`.
>> Nothing special about these partitions; among them were big ones
>> (hundreds of gigs) and tiny ones (megabytes).
>>
>> The fix was to enable unclean leader election and restart one of the
>> affected brokers in each partition:
>> `kafka/bin/kafka-configs.sh --zookeeper <zk's> --entity-type topics --entity-name <topicname> --add-config unclean.leader.election.enable=true --alter`.
>>
>> Has anyone seen something like this, and how can it be avoided at the
>> next upgrade? Maybe it would be better if said cluster got no traffic
>> during the upgrade, but we cannot have a maintenance break as everything
>> is up 24/7. The cluster is for analytics data, some of which is consumed
>> in real-time applications, mostly by secor.
>>
>> BR,
>> Mika
>>
>> --
>> *Mika Linnanoja*
>> Senior Cloud Engineer
>> Games Technology
>> Rovio Entertainment Corp
>> Keilaranta 7, FIN - 02150 Espoo, Finland
>> mika.linnan...@rovio.com
>> www.rovio.com