Hello Colin,

Thanks for your reply. I have unclean.leader.election.enable set to false on
all controllers and brokers.

BR,
Julian

From: Colin McCabe <cmcc...@apache.org>
Date: Monday, 7. April 2025 at 23:54
To: dev@kafka.apache.org <dev@kafka.apache.org>
Subject: Re: Unexpected UNCLEAN leader election behaviour in KRaft vs. Zookeeper
Did you have unclean leader election enabled here?

best,
Colin

On Mon, Apr 7, 2025, at 11:49, Julian Bergner wrote:
> Hi,
> We primarily operate a Kafka cluster consisting of 3 brokers (IDs: 4,
> 5, 6) running in KRaft mode on version 3.9.0. We also ran the same
> scenario in a ZooKeeper-managed cluster (3.9.0) to check that the
> behaviour is consistent, and identified a difference between the two.
> Consider the following scenario:
>
>   *   A topic (foo) with 1 partition and replication factor 2
>   *   Initial ISR for partition 0 is [6, 5] (leader is broker 6)
> When performing a partition reassignment from [6, 5] to [4, 5]
> (removing the current leader 6 and introducing broker 4, which was
> previously not part of the ISR, as the new leader), we observe the
> following in KRaft:
>
>   *   An "UNCLEAN partition change" event is logged, despite having
> unclean.leader.election.enable explicitly set to false on all brokers
> and controllers.
>   *   The metric
> kafka.controller:type=ControllerStats,name=UncleanLeaderElectionsPerSec
> increments continuously and does not reset to 0.
> Relevant logs:
> DEBUG [QuorumController id=3] Node 6 has altered ISR for foo-0 to [5,
> 4]. (org.apache.kafka.controller.ReplicationControlManager)
> INFO [QuorumController id=3] AlterPartition request from node 6 for
> foo-0 completed the ongoing partition reassignment and triggered a
> leadership change. Returning NEW_LEADER_ELECTED.
> (org.apache.kafka.controller.ReplicationControlManager)
> INFO [QuorumController id=3] UNCLEAN partition change for foo-0 with
> topic ID 0nBbSaN0QWy_hmOnfNLNrg: replicas: [4, 5, 6] -> [4, 5],
> directories: [fwoolkW969wV61_D5mIIkg, TFl5RsmVwdUxDjTwUb202A,
> Osd62auxOyraCSUnTJXWTw] -> [fwoolkW969wV61_D5mIIkg,
> TFl5RsmVwdUxDjTwUb202A], isr: [6, 5] -> [5, 4], removingReplicas: [6]
> -> [], addingReplicas: [4] -> [], leader: 6 -> 4, leaderEpoch: 1 -> 2,
> partitionEpoch: 3 -> 4
> (org.apache.kafka.controller.ReplicationControlManager)
> INFO [QuorumController id=3] Replayed partition assignment change
> PartitionChangeRecord(partitionId=0, topicId=0nBbSaN0QWy_hmOnfNLNrg,
> isr=[5, 4], leader=4, replicas=[4, 5], removingReplicas=[],
> addingReplicas=[], leaderRecoveryState=-1,
> directories=[fwoolkW969wV61_D5mIIkg, TFl5RsmVwdUxDjTwUb202A],
> eligibleLeaderReplicas=null, lastKnownElr=null) for topic foo
> (org.apache.kafka.controller.ReplicationControlManager)
>
> Performing exactly the same reassignment test in a ZooKeeper-managed
> cluster did not increment the UncleanLeaderElectionsPerSec metric, and
> no similar "UNCLEAN partition change" log entry was observed.
> Our expectation was that with unclean.leader.election.enable set to
> false, the controller should prevent any unclean leader elections and
> the ISR should only include replicas that were previously synchronized.
>
> Could you confirm whether this difference in behaviour is expected in
> KRaft, or whether it might indicate an issue or misconfiguration?
> Thanks!
> Julian
>
> ________________________________
> Ultra Tendency International GmbH - Amtsgericht Stendal: HRB 26409 -
> Geschäftsführer/CEO: Dr. Robert Neumann
> August-Bebel-Str. 46, 39326 Colbitz, Germany -
> https://ultratendency.com - i...@ultratendency.com
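
For readers who want to reproduce the scenario, the reassignment described in the quoted message (foo-0 moving from [6, 5] to [4, 5]) could be written as a plan file along the following lines. This is only a sketch: the thread does not say which tool was used, so the assumption here is the JSON format accepted by kafka-reassign-partitions.sh via --reassignment-json-file. The helper name build_reassignment is hypothetical.

```python
import json


def build_reassignment(topic, partition, replicas):
    """Build a reassignment plan for a single partition.

    Hypothetical helper; the layout (version 1, a "partitions" list with
    topic/partition/replicas) is the format assumed for
    kafka-reassign-partitions.sh --reassignment-json-file.
    """
    return {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": partition, "replicas": list(replicas)}
        ],
    }


# Target assignment from the thread: replicas [4, 5] for foo-0.
# The first replica in the list is the preferred leader, so broker 4 here.
plan = build_reassignment("foo", 0, [4, 5])
plan_json = json.dumps(plan, indent=2)
print(plan_json)
```

Saving the printed JSON to a file and passing it to the reassignment tool with --execute would request the same move; broker IDs and the topic name are taken from the thread, everything else is an assumption.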

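The ISR and leader transition in the quoted "UNCLEAN partition change" log line can be pulled out mechanically, which makes it easier to compare runs. The sketch below parses that one line (unwrapped from the email) and reports whether the new leader was a member of the previous ISR. It does not reproduce the controller's own classification logic, which is not shown in this thread; parse_change and its field names are hypothetical.

```python
import re

# Log line from the quoted message, with the email line wrapping undone.
LOG_LINE = (
    "UNCLEAN partition change for foo-0 with topic ID 0nBbSaN0QWy_hmOnfNLNrg: "
    "replicas: [4, 5, 6] -> [4, 5], "
    "directories: [fwoolkW969wV61_D5mIIkg, TFl5RsmVwdUxDjTwUb202A, "
    "Osd62auxOyraCSUnTJXWTw] -> [fwoolkW969wV61_D5mIIkg, "
    "TFl5RsmVwdUxDjTwUb202A], isr: [6, 5] -> [5, 4], removingReplicas: [6] "
    "-> [], addingReplicas: [4] -> [], leader: 6 -> 4, leaderEpoch: 1 -> 2, "
    "partitionEpoch: 3 -> 4"
)


def _ids(text):
    """Turn '6, 5' into [6, 5]; an empty string yields an empty list."""
    return [int(part) for part in text.split(",") if part.strip()]


def parse_change(line):
    """Extract the ISR and leader transition from a partition change line."""
    isr = re.search(r"isr: \[([^\]]*)\] -> \[([^\]]*)\]", line)
    # "leader: " (with the colon) does not match the "leaderEpoch: " field.
    leader = re.search(r"leader: (-?\d+) -> (-?\d+)", line)
    if isr is None or leader is None:
        raise ValueError("line does not contain isr/leader transitions")
    return {
        "old_isr": _ids(isr.group(1)),
        "new_isr": _ids(isr.group(2)),
        "old_leader": int(leader.group(1)),
        "new_leader": int(leader.group(2)),
    }


change = parse_change(LOG_LINE)
new_leader_in_old_isr = change["new_leader"] in change["old_isr"]
new_leader_in_new_isr = change["new_leader"] in change["new_isr"]
print(change, new_leader_in_old_isr, new_leader_in_new_isr)
```

For the line above, broker 4 is absent from the old ISR [6, 5] but present in the new ISR [5, 4], which matches the description in the message of a new leader that was not previously part of the ISR.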