Jianbin Chen created KAFKA-20104:
------------------------------------
Summary: Inquiry about migrating from ZooKeeper to KRaft
Key: KAFKA-20104
URL: https://issues.apache.org/jira/browse/KAFKA-20104
Project: Kafka
Issue Type: Wish
Components: core
Affects Versions: 3.9.1
Environment: Rocky Linux 9.4, Kafka 3.9.1
Reporter: Jianbin Chen
Hi everyone,
I’m trying to migrate a test cluster from ZooKeeper to KRaft in-place (i.e.,
not provisioning three new controller-only nodes first and then pointing the
existing brokers to them). I hit a problem and would appreciate any pointers.
What I did
- Enabled zookeeper.metadata.migration.enable on each existing broker and set
the controller quorum settings so each broker acts as a controller+broker
(process.roles=broker,controller).
- Rolled the three nodes.
Relevant broker configuration (each broker has similar config; example shown):
```
process.roles=broker,controller
node.id=7
broker.id=7
zookeeper.metadata.migration.enable=true
controller.quorum.voters=7@broker1:9093,6@broker2:9093,4@broker3:9093
controller.quorum.bootstrap.servers=broker1:9093,broker2:9093,broker3:9093
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
controller.listener.names=CONTROLLER
group.initial.rebalance.delay.ms=0
listeners=SSL://ip:9092,PLAINTEXT://ip:9192,CONTROLLER://ip:9093
listener.security.protocol.map=SSL:SSL,PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
```
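As a sanity check on the quorum settings above, here is a minimal Python
sketch that parses a controller.quorum.voters value (format id@host:port,
comma-separated) and confirms that the local node.id is one of the voters.
The function name parse_quorum_voters and the variable local_node_id are
hypothetical helpers for illustration only; they are not part of Kafka.

```python
def parse_quorum_voters(voters: str) -> dict[int, tuple[str, int]]:
    """Parse a controller.quorum.voters value into {node_id: (host, port)}."""
    result = {}
    for entry in voters.split(","):
        # Each entry looks like "7@broker1:9093".
        node_id, _, endpoint = entry.strip().partition("@")
        host, _, port = endpoint.rpartition(":")
        result[int(node_id)] = (host, int(port))
    return result


# Value copied from the configuration shown above.
voters = parse_quorum_voters("7@broker1:9093,6@broker2:9093,4@broker3:9093")

local_node_id = 7  # node.id of the broker being checked (hypothetical variable)
print(local_node_id in voters)  # is this node listed as a voter?
print(voters[local_node_id])    # endpoint the quorum expects for this node
```

This only checks that the voter list and node.id agree with each other; it
does not say anything about whether the brokers register with KRaft.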
Observed behavior
After the rolling restart, the controller logs show that the quorum is ready
for migration, but the migration driver repeatedly logs that no brokers are
known to KRaft:
```
[2026-01-27 17:47:15,626] INFO [KRaftMigrationDriver id=7] Controller Quorum is ready for Zk to KRaft migration. Now waiting for ZK brokers. (org.apache.kafka.metadata.migration.KRaftMigrationDriver)
[2026-01-27 17:47:15,627] INFO [KRaftMigrationDriver id=7] 7 transitioning from WAIT_FOR_CONTROLLER_QUORUM to WAIT_FOR_BROKERS state (org.apache.kafka.metadata.migration.KRaftMigrationDriver)
[2026-01-27 17:47:15,627] INFO [KRaftMigrationDriver id=7] No brokers are known to KRaft, waiting for brokers to register. (org.apache.kafka.metadata.migration.KRaftMigrationDriver)
[2026-01-27 17:47:15,726] INFO [KRaftMigrationDriver id=7] No brokers are known to KRaft, waiting for brokers to register. (org.apache.kafka.metadata.migration.KRaftMigrationDriver)
[2026-01-27 17:47:15,925] INFO [KRaftMigrationDriver id=7] No brokers are known to KRaft, waiting for brokers to register.
```
In other words, the driver moves from WAIT_FOR_CONTROLLER_QUORUM to
WAIT_FOR_BROKERS and then stays there: the same message repeats and the
migration makes no further progress.
My current understanding is as follows:
- Migration from ZooKeeper to KRaft normally involves standing up a separate
controller-only KRaft cluster first, then updating the existing brokers to
point to that controller cluster (via controller.quorum.bootstrap.servers) and
enabling zookeeper.metadata.migration.enable.
- Performing an in-place migration (having the existing brokers also act as
controllers) seems risky because the controller quorum needs a majority of its
voters to elect a leader. With three voters, two brokers have to be restarted
with the new roles before a quorum can form. If those two restarts overlap,
any topic with replication.factor=2 whose replicas both sit on those brokers
would be unavailable during the migration.
- Therefore I am unsure whether my understanding is correct (i.e., in-place
migration is unsafe or unsupported for production-like setups) or whether Kafka
actually supports in-place migration and I have a configuration error.
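
To make the majority concern above concrete, here is a minimal Python sketch.
It assumes the usual majority rule (floor(n/2) + 1 voters) and a deliberately
simplified availability model in which a partition is available as long as at
least one of its replicas is on a running broker; it ignores ISR state and
min.insync.replicas. All function names are hypothetical and exist only for
this illustration.

```python
def majority(voter_count: int) -> int:
    """Smallest number of voters that forms a majority."""
    return voter_count // 2 + 1


def has_quorum(voter_count: int, voters_up: int) -> bool:
    """True if enough voters are running to elect a controller leader."""
    return voters_up >= majority(voter_count)


def partition_available(replicas: set[int], brokers_down: set[int]) -> bool:
    """Simplified model: available while at least one replica's broker is up."""
    return bool(replicas - brokers_down)


# Three voters (node ids 7, 6, 4 as in the configuration above).
print(majority(3))        # 2
print(has_quorum(3, 1))   # False: one restarted node cannot form a quorum

# An RF=2 partition whose replicas are on brokers 7 and 6.
print(partition_available({7, 6}, {7, 6}))  # False: both replicas down at once
print(partition_available({7, 6}, {7}))     # True: one replica still up
```

Under these assumptions the risk only materializes if the two restarts
overlap; restarting one broker at a time keeps one replica of each RF=2
partition up, but then the quorum cannot form until the second node is back.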
I would greatly appreciate it if you could confirm which is the case. If
in-place migration is supported, could you please advise what configuration or
sequence I am missing so that brokers register with KRaft correctly? If
in-place migration is not recommended, could you recommend the safest procedure
to migrate a test cluster while minimizing downtime for producers and consumers?
Thank you very much for your time and assistance.