Hi, David,

Thanks for the KIP. +1

Jun

On Tue, Nov 22, 2022 at 3:43 PM Jason Gustafson <ja...@confluent.io.invalid>
wrote:

> Thanks, +1 from me. I suspect we might be in for at least one surprise with
> the re-implemented controller RPCs, but I agree the alternative has risks
> as well.
>
> Best,
> Jason
>
> On Mon, Nov 14, 2022 at 12:00 PM Colin McCabe <cmcc...@apache.org> wrote:
>
> > On Fri, Nov 11, 2022, at 08:59, David Arthur wrote:
> > > Thanks, Colin.
> > >
> > >> never start an upgrade without first verifying the quorum
> configuration
> > on the ZK-based brokers
> > >
> > > I agree that this is a pretty big benefit. I could imagine debugging
> > > and fixing connection problems mid-migration would be a big pain.
> > > Especially if you had some brokers correctly configured, and others
> > > not.
> > >
> > > Adding a heartbeat raises some questions about what to do if a broker
> > > goes into a bad state, or stops heartbeating, during a migration.
> > > However, I think the same is true for a registration based approach,
> > > so maybe it's not an increase in net complexity.
> > >
> >
> > Hi David,
> >
> > Yeah. I think the goal should be for the set of heartbeaters to match the
> > set of broker registrations under /brokers
> >
> > Obviously, people could add or remove brokers after the upgrade has
> begun,
> > but that's unavoidable, I think. We can at least ensure that at the time
> we
> > enter upgrade, all the brokers are ready.
> >
> > > I've replaced the ZK registration section with a new RPC and brief
> > > description. Please take a look.
> > >
> >
> > Thanks, David. With these changes it LGTM to me.
> >
> > +1 (binding)
> >
> > Colin
> >
> > > Thanks!
> > > David
> > >
> > > On Wed, Nov 9, 2022 at 5:46 PM Colin McCabe <cmcc...@apache.org>
> wrote:
> > >>
> > >> Hi David,
> > >>
> > >> Thanks for the response. Replies inline.
> > >>
> > >> On Wed, Nov 9, 2022, at 08:17, David Arthur wrote:
> > >> > Colin
> > >> >
> > >> >>  Maybe zk.metadata.migration.enable ?
> > >> >
> > >> > Done. I went with "zookeeper.metadata.migration.enable" since our
> > >> > other ZK configs start with "zookeeper.*"
> > >> >
> > >> >> SImilarly, for MigrationRecord: can we rename this to
> > ZkMigrationStateRecord? Then change MigrationState -> ZkMigrationState.
> > >> >
> > >> > Sure
> > >> >
> > >> >> With ZkMigrationStateRecord, one thing to keep in mind here is that
> > we will eventually compact all the metadata logs into a snapshot. That
> > snapshot will then have to keep alive the memory of the old migration. So
> > it is not really a matter of replaying the old metadata logs (probably)
> but
> > a matter of checking to see what the ZkMigrationState is, which I suppose
> > could be Optional<ZkMigrationState>. If it's not Optional.empty, we
> already
> > migrated / are migrating.
> > >> >
> > >> > Yea, makes sense.
> > >> >
> > >> >> For the /migration ZNode, is "last_update_time_ms" necessary? I
> > thought ZK already tracked this information in the mzxid of the znode?
> > >> >
> > >> > Yes, Jun pointed this out previously, I missed this update in the
> KIP.
> > >> > Fixed now.
> > >> >
> > >> >> It is true that technically it is only needed in UMR, but I would
> > still suggest including KRaftControllerId in LeaderAndIsrRequest because
> it
> > will make debugging much easier.
> > >> >>
> > >> >> I would suggest not implementing the topic deletion state machine,
> > but just deleting topics eagerly when in migration mode. We can implement
> > this behavior change by keying off of whether KRaftControllerId is
> present
> > in LeaderAndIsrRequest. On broker startup, we'll be sent a full
> > LeaderAndIsrRequest and can delete stray partitions whose IDs are not as
> > expected (again, this behavior change would only be for migration mode)
> > >> >
> > >> > Sounds good to me. Since this is somewhat of an implementation
> detail,
> > >> > do you think we need this included in the KIP?
> > >>
> > >> Yeah, maybe we don't need to go into the delete behavior here. But I
> > think the KIP should specify that we have KRaftControllerId in both
> > LeaderAndIsrRequest. That will allow us to implement this behavior
> > conditionally on zk-based brokers when in dual write mode.
> > >>
> > >> >
> > >> >> For existing KRaft controllers, will
> > kafka.controller:type=KafkaController,name=MigrationState show up as 4
> > (MigrationFinalized)? I assume this is true, but it would be good to
> spell
> > it out. Sorry if this is answered somewhere else.
> > >> >
> > >> > We discussed using 0 (None) as the value to report for original,
> > >> > un-migrated KRaft clusters. 4 (MigrationFinalized) would be for
> > >> > clusters which underwent a migration. I have some description of
> this
> > >> > in the table under "Migration Overview"
> > >> >
> > >>
> > >> I don't feel that strongly about this, but wouldn't it be a good idea
> > for MigrationState to have a different value for ZK-based clusters and
> > KRaft-based clusters? If you have a bunch of clusters and you take an
> > aggregate of this metric, it would be good to get a report of three
> numbers:
> > >> 1. unupgraded ZK
> > >> 2. in progress upgrades
> > >> 3. kraft
> > >>
> > >> I guess we could get that from examining some other metrics too,
> > though. Not sure, what do you think?
> > >>
> > >> >> As you point out, the ZK brokers being upgraded will need to
> contact
> > the KRaft quorum in order to forward requests to there, once we are in
> > migration mode. This raises a question: rather than changing the broker
> > registration, can we have those brokers send an RPC to the kraft
> controller
> > quorum instead? This would serve to confirm that they can reach the
> quorum.
> > Then the quorum could wait for all of the brokers to check in this way
> > before starting the migration (It would know all the brokers by looking
> at
> > /brokers)
> > >> >
> > >> > One of the motivations I had for putting the migration details in
> the
> > >> > broker registration is that it removes ordering constraints between
> > >> > the brokers and controllers when starting the migration. If we set
> the
> > >> > brokers in migration mode before the KRaft quorum is available, they
> > >> > will just carry on until the KRaft controller takes over the
> > >> > controller leadership and starts sending UMR/LISR.
> > >> >
> > >>
> > >> I agree that any scheme that requires complicated startup ordering is
> > not a good idea. I was more thinking about having the zk-based brokers
> > periodically send some kind of "I'm here" upgrade heartbeat to the
> > controller quorum.
> > >>
> > >> > I wonder if we could infer reachability of the new KRaft controller
> by
> > >> > the brokers through ApiVersions? Once taking leadership, the active
> > >> > KRaft controller could monitor its NodeApiVersions to see that each
> of
> > >> > the ZK brokers had reached it. Would this be enough to verify
> > >> > reachability?
> > >> >
> > >>
> > >> If the new KRaft controller has already taken leadership, it's too
> late
> > to do anything about brokers that can't reach it. Unless we want to
> > implement automatic rollback to the pre-upgrade state, which sounds quite
> > complex and error prone.
> > >>
> > >> So, to sum up: my proposal was that the ZK based brokers periodically
> > (like maybe every 10 seconds) try to contact the quorum with an RPC
> > containing their IBP and readiness state. I guess the main negative thing
> > is that it would be more network traffic (but probably not significantly
> > more.) The main positive thing is that we'd never start an upgrade
> without
> > first verifying the quorum configuration on the ZK-based brokers. A
> neutral
> > thing (neither good nor bad, I think) is that we'd have a new RPC rather
> > than a znode change.
> > >>
> > >> best,
> > >> Colin
> > >>
> > >>
> > >> > -David
> > >> >
> > >> >
> > >> > On Tue, Nov 8, 2022 at 6:15 PM Colin McCabe <cmcc...@apache.org>
> > wrote:
> > >> >>
> > >> >> Hi David,
> > >> >>
> > >> >> Looks great. Some questions:
> > >> >>
> > >> >> I agree with Jun that it would be good to rename
> > metadata.migration.enable to something more zk-specific. Maybe
> > zk.metadata.migration.enable ?
> > >> >>
> > >> >> SImilarly, for MigrationRecord: can we rename this to
> > ZkMigrationStateRecord? Then change MigrationState -> ZkMigrationState.
> > >> >>
> > >> >> With ZkMigrationStateRecord, one thing to keep in mind here is that
> > we will eventually compact all the metadata logs into a snapshot. That
> > snapshot will then have to keep alive the memory of the old migration. So
> > it is not really a matter of replaying the old metadata logs (probably)
> but
> > a matter of checking to see what the ZkMigrationState is, which I suppose
> > could be Optional<ZkMigrationState>. If it's not Optional.empty, we
> already
> > migrated / are migrating.
> > >> >>
> > >> >> For the /migration ZNode, is "last_update_time_ms" necessary? I
> > thought ZK already tracked this information in the mzxid of the znode?
> > >> >>
> > >> >> It is true that technically it is only needed in UMR, but I would
> > still suggest including KRaftControllerId in LeaderAndIsrRequest because
> it
> > will make debugging much easier.
> > >> >>
> > >> >> I would suggest not implementing the topic deletion state machine,
> > but just deleting topics eagerly when in migration mode. We can implement
> > this behavior change by keying off of whether KRaftControllerId is
> present
> > in LeaderAndIsrRequest. On broker startup, we'll be sent a full
> > LeaderAndIsrRequest and can delete stray partitions whose IDs are not as
> > expected (again, this behavior change would only be for migration mode)
> > >> >>
> > >> >> For existing KRaft controllers, will
> > kafka.controller:type=KafkaController,name=MigrationState show up as 4
> > (MigrationFinalized)? I assume this is true, but it would be good to
> spell
> > it out. Sorry if this is answered somewhere else.
> > >> >>
> > >> >> As you point out, the ZK brokers being upgraded will need to
> contact
> > the KRaft quorum in order to forward requests to there, once we are in
> > migration mode. This raises a question: rather than changing the broker
> > registration, can we have those brokers send an RPC to the kraft
> controller
> > quorum instead? This would serve to confirm that they can reach the
> quorum.
> > Then the quorum could wait for all of the brokers to check in this way
> > before starting the migration (It would know all the brokers by looking
> at
> > /brokers)
> > >> >>
> > >> >> best,
> > >> >> Colin
> > >> >>
> > >> >> On Tue, Nov 8, 2022, at 07:23, David Arthur wrote:
> > >> >> > Hello everyone, I'd like to start the vote on KIP-866.
> > >> >> >
> > >> >> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-866+ZooKeeper+to+KRaft+Migration
> > >> >> >
> > >> >> > Thanks!
> > >> >> > David Arthur
> > >> >
> > >> >
> > >> >
> > >> > --
> > >> > David Arthur
> > >
> > >
> > >
> > > --
> > > David Arthur
> >
>

Reply via email to