Hi David, Looks great. Some questions:
I agree with Jun that it would be good to rename metadata.migration.enable to something more zk-specific. Maybe zk.metadata.migration.enable ? SImilarly, for MigrationRecord: can we rename this to ZkMigrationStateRecord? Then change MigrationState -> ZkMigrationState. With ZkMigrationStateRecord, one thing to keep in mind here is that we will eventually compact all the metadata logs into a snapshot. That snapshot will then have to keep alive the memory of the old migration. So it is not really a matter of replaying the old metadata logs (probably) but a matter of checking to see what the ZkMigrationState is, which I suppose could be Optional<ZkMigrationState>. If it's not Optional.empty, we already migrated / are migrating. For the /migration ZNode, is "last_update_time_ms" necessary? I thought ZK already tracked this information in the mzxid of the znode? It is true that technically it is only needed in UMR, but I would still suggest including KRaftControllerId in LeaderAndIsrRequest because it will make debugging much easier. I would suggest not implementing the topic deletion state machine, but just deleting topics eagerly when in migration mode. We can implement this behavior change by keying off of whether KRaftControllerId is present in LeaderAndIsrRequest. On broker startup, we'll be sent a full LeaderAndIsrRequest and can delete stray partitions whose IDs are not as expected (again, this behavior change would only be for migration mode) For existing KRaft controllers, will kafka.controller:type=KafkaController,name=MigrationState show up as 4 (MigrationFinalized)? I assume this is true, but it would be good to spell it out. Sorry if this is answered somewhere else. As you point out, the ZK brokers being upgraded will need to contact the KRaft quorum in order to forward requests to there, once we are in migration mode. This raises a question: rather than changing the broker registration, can we have those brokers send an RPC to the kraft controller quorum instead? This would serve to confirm that they can reach the quorum. Then the quorum could wait for all of the brokers to check in this way before starting the migration (It would know all the brokers by looking at /brokers) best, Colin On Tue, Nov 8, 2022, at 07:23, David Arthur wrote: > Hello everyone, I'd like to start the vote on KIP-866. > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-866+ZooKeeper+to+KRaft+Migration > > Thanks! > David Arthur