Colin > Maybe zk.metadata.migration.enable ?
Done. I went with "zookeeper.metadata.migration.enable" since our other ZK configs start with "zookeeper.*" > SImilarly, for MigrationRecord: can we rename this to ZkMigrationStateRecord? > Then change MigrationState -> ZkMigrationState. Sure > With ZkMigrationStateRecord, one thing to keep in mind here is that we will > eventually compact all the metadata logs into a snapshot. That snapshot will > then have to keep alive the memory of the old migration. So it is not really > a matter of replaying the old metadata logs (probably) but a matter of > checking to see what the ZkMigrationState is, which I suppose could be > Optional<ZkMigrationState>. If it's not Optional.empty, we already migrated / > are migrating. Yea, makes sense. > For the /migration ZNode, is "last_update_time_ms" necessary? I thought ZK > already tracked this information in the mzxid of the znode? Yes, Jun pointed this out previously, I missed this update in the KIP. Fixed now. > It is true that technically it is only needed in UMR, but I would still > suggest including KRaftControllerId in LeaderAndIsrRequest because it will > make debugging much easier. > > I would suggest not implementing the topic deletion state machine, but just > deleting topics eagerly when in migration mode. We can implement this > behavior change by keying off of whether KRaftControllerId is present in > LeaderAndIsrRequest. On broker startup, we'll be sent a full > LeaderAndIsrRequest and can delete stray partitions whose IDs are not as > expected (again, this behavior change would only be for migration mode) Sounds good to me. Since this is somewhat of an implementation detail, do you think we need this included in the KIP? > For existing KRaft controllers, will > kafka.controller:type=KafkaController,name=MigrationState show up as 4 > (MigrationFinalized)? I assume this is true, but it would be good to spell it > out. Sorry if this is answered somewhere else. We discussed using 0 (None) as the value to report for original, un-migrated KRaft clusters. 4 (MigrationFinalized) would be for clusters which underwent a migration. I have some description of this in the table under "Migration Overview" > As you point out, the ZK brokers being upgraded will need to contact the > KRaft quorum in order to forward requests to there, once we are in migration > mode. This raises a question: rather than changing the broker registration, > can we have those brokers send an RPC to the kraft controller quorum instead? > This would serve to confirm that they can reach the quorum. Then the quorum > could wait for all of the brokers to check in this way before starting the > migration (It would know all the brokers by looking at /brokers) One of the motivations I had for putting the migration details in the broker registration is that it removes ordering constraints between the brokers and controllers when starting the migration. If we set the brokers in migration mode before the KRaft quorum is available, they will just carry on until the KRaft controller takes over the controller leadership and starts sending UMR/LISR. I wonder if we could infer reachability of the new KRaft controller by the brokers through ApiVersions? Once taking leadership, the active KRaft controller could monitor its NodeApiVersions to see that each of the ZK brokers had reached it. Would this be enough to verify reachability? -David On Tue, Nov 8, 2022 at 6:15 PM Colin McCabe <cmcc...@apache.org> wrote: > > Hi David, > > Looks great. Some questions: > > I agree with Jun that it would be good to rename metadata.migration.enable to > something more zk-specific. Maybe zk.metadata.migration.enable ? > > SImilarly, for MigrationRecord: can we rename this to ZkMigrationStateRecord? > Then change MigrationState -> ZkMigrationState. > > With ZkMigrationStateRecord, one thing to keep in mind here is that we will > eventually compact all the metadata logs into a snapshot. That snapshot will > then have to keep alive the memory of the old migration. So it is not really > a matter of replaying the old metadata logs (probably) but a matter of > checking to see what the ZkMigrationState is, which I suppose could be > Optional<ZkMigrationState>. If it's not Optional.empty, we already migrated / > are migrating. > > For the /migration ZNode, is "last_update_time_ms" necessary? I thought ZK > already tracked this information in the mzxid of the znode? > > It is true that technically it is only needed in UMR, but I would still > suggest including KRaftControllerId in LeaderAndIsrRequest because it will > make debugging much easier. > > I would suggest not implementing the topic deletion state machine, but just > deleting topics eagerly when in migration mode. We can implement this > behavior change by keying off of whether KRaftControllerId is present in > LeaderAndIsrRequest. On broker startup, we'll be sent a full > LeaderAndIsrRequest and can delete stray partitions whose IDs are not as > expected (again, this behavior change would only be for migration mode) > > For existing KRaft controllers, will > kafka.controller:type=KafkaController,name=MigrationState show up as 4 > (MigrationFinalized)? I assume this is true, but it would be good to spell it > out. Sorry if this is answered somewhere else. > > As you point out, the ZK brokers being upgraded will need to contact the > KRaft quorum in order to forward requests to there, once we are in migration > mode. This raises a question: rather than changing the broker registration, > can we have those brokers send an RPC to the kraft controller quorum instead? > This would serve to confirm that they can reach the quorum. Then the quorum > could wait for all of the brokers to check in this way before starting the > migration (It would know all the brokers by looking at /brokers) > > best, > Colin > > On Tue, Nov 8, 2022, at 07:23, David Arthur wrote: > > Hello everyone, I'd like to start the vote on KIP-866. > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-866+ZooKeeper+to+KRaft+Migration > > > > Thanks! > > David Arthur -- David Arthur