Hello,

I’m trying to understand how replication and leader election work in Kafka,
and I have a couple of questions.

I know that Kafka does not rely on a majority-quorum approach. Instead,
there is an ISR set, which consists of replicas that the current leader
considers sufficiently up-to-date (this is partially managed through
replica.lag.time.max.ms). The leader election algorithm itself (
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/PartitionStateMachine.scala#L549)
is quite straightforward.

As long as the leader has not “kicked out” a replica from the ISR due to
excessive lag, that replica can become the new leader if the current leader
failsa and control plane is aware of which replicas belong to the ISR.

Here’s the scenario I’m confused about: Suppose the leader(Broker #1) dies
before it can update the ISR to remove lagging replicas. Let’s say:

- Broker #2 fully fetched the most recent messages from the leader,
- Broker #3 is still lagging,
- Broker #4 only pulled metadata.

>From the control plane’s perspective, since the leader did not remove them
from the ISR, all these replicas appear up-to-date enough to take over as
leader—even though they each have different states.

I know Kafka has a Replication Reconciliation process, but I’m not entirely
clear on how it determines the “source of truth” for offsets. Does it
require some sort of majoritary quorum to decide whose offsets are valid
(via offset metadata polling from other replicas)? Or does the replica with
the most recent offsets (and a certain epoch) automatically become the new
leader, forcing the others to reconcile by removing any records that don’t
align with the new leader’s log—even if a “majority” of brokers have those
records?

I’d really appreciate any clarification on how this scenario is handled, as
well as any relevant documentation or code references. Thanks!

Reply via email to