Hi,

We are running Apache Kafka v2.7.0 in production in a 3-rack setup (3 AZs in a single AWS region) with a per-topic replication factor of 3 and the following global broker settings:
unclean.leader.election.enable=false
min.insync.replicas=2
replica.lag.time.max.ms=10000
replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector

The Kafka producer is configured with acks=all on the client side.

Recently we experienced network performance degradation and partitioning in one of the AZs. For some of the partitions hosted there, the ISR list shrank down to just the leader running in the problematic zone, and those partitions went offline. The brokers in that zone were ultimately shut down by the administrators. Nothing unexpected so far, but we would like to have a better understanding of the overall situation.

First, because of the combination of our minimum in-sync replica requirement and our client configuration, we expect that at least one of the two remaining brokers has all the data for such a partition that was acknowledged by the leader before the ISR shrank down to just the leader itself. Is this understanding correct?

Second, once the leader is completely down, a clean leader election is not possible. If we enabled unclean leader election for the affected topic, should we expect Kafka to select one of the remaining brokers in a *completely random* fashion, or does it take into account how far each of them has fallen behind the former leader? If it is the former, we face a 50% risk of losing some data, even though we know that all acknowledged writes were replicated to one of the brokers that are still available.

In summary: is there a risk of data loss in such a scenario? Is this risk avoidable, and if so, what are the prerequisites?

Cheers,
--
Alex
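
P.S. For completeness, our producer-side configuration boils down to the following (a minimal sketch; the bootstrap servers and serializers are placeholders, not our actual values):

    # producer.properties (sketch; placeholder values)
    bootstrap.servers=broker-a:9092,broker-b:9092,broker-c:9092
    key.serializer=org.apache.kafka.common.serialization.StringSerializer
    value.serializer=org.apache.kafka.common.serialization.StringSerializer
    # acks=all makes the leader wait for the full ISR before acknowledging,
    # which together with min.insync.replicas=2 should mean every
    # acknowledged write exists on at least two brokers
    acks=all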
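
P.P.S. If we were to go down the unclean-election route, we would enable it per topic rather than globally, along these lines (command sketch; the broker address and topic name are placeholders):

    # Enable unclean leader election for a single topic (Kafka 2.7 CLI)
    bin/kafka-configs.sh --bootstrap-server broker-a:9092 \
      --entity-type topics --entity-name <affected-topic> \
      --alter --add-config unclean.leader.election.enable=true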