Hi,

We are running Apache Kafka v2.7.0 in production in a 3-rack setup (3 AZs in a single AWS region) with a per-topic replication factor of 3 and the following global broker settings:
unclean.leader.election.enable=false
min.insync.replicas=2
replica.lag.time.max.ms=10000
replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector

The Kafka producer is configured with acks=all on the client side.

Recently we experienced network performance degradation and partitioning in one of the AZs. For some of the partitions hosted there, the ISR list shrank down to just the leader running in the problematic zone, and those partitions went offline. The brokers in that zone were ultimately shut down by the administrators. Nothing unexpected so far, but we would like to have a better understanding of the overall situation.

First, because of the combination of our minimum in-sync replica requirement and our client configuration, we expect that at least one of the two remaining brokers has all the data for such a partition that was acknowledged by the leader before the ISR shrank down to just the leader itself. Is this understanding correct?

Second, once the leader is completely down, a clean leader election is not possible. If we enabled unclean leader election for the affected topic, should we expect Kafka to select one of the remaining brokers in a *completely random* fashion, or does it take into account how far each of them has fallen behind the former leader? If it is the former, we face a 50% risk of losing some data, even though we know that all acknowledged writes were replicated to one of the brokers that are still available.

In summary: is there a risk of data loss in such a scenario? Is this risk avoidable, and if so, what are the prerequisites?

Cheers,
--
Alex
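
P.S. For completeness, our producer-side configuration boils down to the following (a minimal sketch; the bootstrap servers and serializers are placeholders, not our actual values):

    # producer.properties (sketch; placeholder values)
    bootstrap.servers=broker-a:9092,broker-b:9092,broker-c:9092
    key.serializer=org.apache.kafka.common.serialization.StringSerializer
    value.serializer=org.apache.kafka.common.serialization.StringSerializer
    # acks=all makes the leader wait for the full ISR before acknowledging,
    # which together with min.insync.replicas=2 should mean every
    # acknowledged write exists on at least two brokers
    acks=all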
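
P.P.S. If we were to go down the unclean-election route, we would enable it per topic rather than globally, along these lines (command sketch; the broker address and topic name are placeholders):

    # Enable unclean leader election for a single topic (Kafka 2.7 CLI)
    bin/kafka-configs.sh --bootstrap-server broker-a:9092 \
      --entity-type topics --entity-name <affected-topic> \
      --alter --add-config unclean.leader.election.enable=true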