Hi,

We primarily operate a Kafka cluster consisting of 3 brokers (IDs: 4, 5, 6) running in KRaft mode on version 3.9.0. We also tested the same scenario in a ZooKeeper-managed cluster (3.9.0) to check whether the behaviour is consistent, and identified a difference between the two. Consider the following scenario:

* A topic (foo) with 1 partition and replication factor 2
* Initial ISR for partition 0 is [6, 5] (leader is broker 6)

When performing a partition reassignment from [6, 5] to [4, 5] (removing the current leader 6 and introducing broker 4, previously not part of the ISR, as the new leader), we observe the following in KRaft:

* An "UNCLEAN partition change" event is logged, despite unclean.leader.election.enable being explicitly set to false on all brokers and controllers.
* The metric kafka.controller:type=ControllerStats,name=UncleanLeaderElectionsPerSec increments continuously and does not reset to 0.

Relevant logs:

DEBUG [QuorumController id=3] Node 6 has altered ISR for foo-0 to [5, 4]. (org.apache.kafka.controller.ReplicationControlManager)

INFO [QuorumController id=3] AlterPartition request from node 6 for foo-0 completed the ongoing partition reassignment and triggered a leadership change. Returning NEW_LEADER_ELECTED. (org.apache.kafka.controller.ReplicationControlManager)

INFO [QuorumController id=3] UNCLEAN partition change for foo-0 with topic ID 0nBbSaN0QWy_hmOnfNLNrg: replicas: [4, 5, 6] -> [4, 5], directories: [fwoolkW969wV61_D5mIIkg, TFl5RsmVwdUxDjTwUb202A, Osd62auxOyraCSUnTJXWTw] -> [fwoolkW969wV61_D5mIIkg, TFl5RsmVwdUxDjTwUb202A], isr: [6, 5] -> [5, 4], removingReplicas: [6] -> [], addingReplicas: [4] -> [], leader: 6 -> 4, leaderEpoch: 1 -> 2, partitionEpoch: 3 -> 4 (org.apache.kafka.controller.ReplicationControlManager)

INFO [QuorumController id=3] Replayed partition assignment change PartitionChangeRecord(partitionId=0, topicId=0nBbSaN0QWy_hmOnfNLNrg, isr=[5, 4], leader=4, replicas=[4, 5], removingReplicas=[], addingReplicas=[], leaderRecoveryState=-1, directories=[fwoolkW969wV61_D5mIIkg, TFl5RsmVwdUxDjTwUb202A], eligibleLeaderReplicas=null, lastKnownElr=null) for topic foo (org.apache.kafka.controller.ReplicationControlManager)

Performing this exact same reassignment test in a ZooKeeper-managed cluster did not result in any increment in the UncleanLeaderElectionsPerSec metric, nor was any similar "UNCLEAN partition change" log observed.

Our expectation was that with unclean.leader.election.enable set to false, the controller should prevent any unclean leader elections and the ISR should only include replicas that were previously synchronized.

Could you confirm whether this behaviour difference is expected in KRaft, or whether it might indicate an issue or misconfiguration?

Thanks!
Julian

________________________________
Ultra Tendency International GmbH - Amtsgericht Stendal: HRB 26409 - Geschäftsführer/CEO: Dr. Robert Neumann
August-Bebel-Str. 46, 39326 Colbitz, Germany - https://ultratendency.com - i...@ultratendency.com
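
For reference, a minimal sketch of how the reassignment input for the scenario above could be generated. It assumes the reassignment is driven through the stock kafka-reassign-partitions.sh tool, which the report does not state. The helper name build_reassignment is hypothetical and used only for illustration.

```python
import json


def build_reassignment(topic, partition, replicas):
    """Build a reassignment plan in the JSON layout read by
    kafka-reassign-partitions.sh (hypothetical helper, for illustration)."""
    return {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": partition, "replicas": list(replicas)}
        ],
    }


# Scenario from this report: foo-0 is moved from [6, 5] to [4, 5].
# The first replica in the list (4) is the preferred leader.
plan = build_reassignment("foo", 0, [4, 5])
plan_json = json.dumps(plan, indent=2)
print(plan_json)
```

Saved to a file (for example reassign.json, an assumed name), the plan would be applied with kafka-reassign-partitions.sh --bootstrap-server <host:port> --reassignment-json-file reassign.json --execute, where <host:port> is a placeholder for one of the brokers.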