Jun, I've updated the doc with clarification of combined mode under
the ApiVersionsResponse section. I also added your suggestion to
rename the config to something more explicit. Colin had some feedback
in the VOTE thread which I've also incorporated.

Thanks!
David

On Tue, Nov 8, 2022 at 1:18 PM Jun Rao <j...@confluent.io.invalid> wrote:
>
> Hi, David,
>
> Thanks for the reply.
>
> 20/21. Thanks for the explanation. The suggestion sounds good. Could you
> include that in the doc?
>
> 40. metadata.migration.enable: We may do future metadata migrations within
> KRaft. Could we make the name more specific to ZK migration like
> zookeeper.metadata.migration.enable?
>
> Thanks,
>
> Jun
>
> On Tue, Nov 8, 2022 at 9:47 AM David Arthur <mum...@gmail.com> wrote:
>
> > Ah, sorry about that, you're right. Since we won't support ZK
> > migrations in combined mode, is this issue avoided?
> >
> > Essentially, we would only set ZkMigrationReady in ApiVersionsResponse if
> >
> > * process.roles=controller
> > * kafka.metadata.migration.enabled=true
> >
> > Also, the following would be an invalid config that should prevent startup:
> >
> > * process.roles=broker,controller
> > * kafka.metadata.migration.enabled=true
> >
> > Does this seem reasonable?
> >
> > Thanks!
> > David
> >
> > On Tue, Nov 8, 2022 at 11:12 AM Jun Rao <j...@confluent.io.invalid> wrote:
> > >
> > > Hi, David,
> > >
> > > I am not sure that we are fully settled on the following.
> > >
> > > 20/21. Since separate listeners are optional, it seems that the broker
> > > can't distinguish between ApiVersion requests coming from the client or
> > > other brokers. This means the clients will get ZkMigrationReady in the
> > > ApiVersion response, which is weird.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Tue, Nov 8, 2022 at 7:18 AM David Arthur <mum...@gmail.com> wrote:
> > >
> > > > Thanks for the discussion everyone, I'm going to move ahead with the
> > > > vote for this KIP.
> > > >
> > > > -David
> > > >
> > > > On Thu, Nov 3, 2022 at 1:20 PM Jun Rao <j...@confluent.io.invalid>
> > wrote:
> > > > >
> > > > > Hi, David,
> > > > >
> > > > > Thanks for the reply.
> > > > >
> > > > > 20/21 Yes, but separate listeners are optional. It's possible for the
> > > > nodes
> > > > > to use a single port for both client and server side communications.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Thu, Nov 3, 2022 at 9:59 AM David Arthur
> > > > > <david.art...@confluent.io.invalid> wrote:
> > > > >
> > > > > > 20/21, in combined mode we still have a separate listener for the
> > > > > > controller APIs, e.g.,
> > > > > >
> > > > > > listeners=PLAINTEXT://:9092,CONTROLLER://:9093
> > > > > >
> > > > > > inter.broker.listener.name=PLAINTEXT
> > > > > >
> > > > > > controller.listener.names=CONTROLLER
> > > > > >
> > > > > > advertised.listeners=PLAINTEXT://localhost:9092
> > > > > >
> > > > > >
> > > > > >
> > > > > > Clients still talk to the broker through the advertised listener,
> > and
> > > > only
> > > > > > brokers and other controllers will communicate over the controller
> > > > > > listener.
> > > > > >
> > > > > > 40. Sounds good, I updated the KIP
> > > > > >
> > > > > > Thanks!
> > > > > > David
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Nov 3, 2022 at 12:14 PM Jun Rao <j...@confluent.io.invalid>
> > > > wrote:
> > > > > >
> > > > > > > Hi, David,
> > > > > > >
> > > > > > > Thanks for the reply.
> > > > > > >
> > > > > > > 20/21. When KRaft runs in the combined mode, does a controller
> > know
> > > > > > whether
> > > > > > > an ApiRequest is from a client or another broker?
> > > > > > >
> > > > > > > 40. Adding a "None" state sounds reasonable.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > > > On Thu, Nov 3, 2022 at 8:39 AM David Arthur
> > > > > > > <david.art...@confluent.io.invalid> wrote:
> > > > > > >
> > > > > > > > Jun,
> > > > > > > >
> > > > > > > > 20/21 If we use a tagged field, then I don't think clients
> > need to
> > > > be
> > > > > > > > concerned with it, right?. In ApiVersionsResponse sent by
> > brokers
> > > > to
> > > > > > > > clients, this field would be omitted. Clients can't connect
> > > > directly to
> > > > > > > the
> > > > > > > > KRaft controller nodes. Also, we have a precedent of sending
> > > > controller
> > > > > > > > node state between controllers through ApiVersions
> > > > ("metadata.version"
> > > > > > > > feature), so I think it fits well.
> > > > > > > >
> > > > > > > > 40. For new KRaft clusters, we could add another state to
> > indicate
> > > > it
> > > > > > was
> > > > > > > > not migrated from ZK, but created as a KRaft cluster. Maybe
> > > > > > > > "kafka.controller:type=KafkaController,name=MigrationState" =>
> > > > "None" ?
> > > > > > > We
> > > > > > > > could also omit that metric for unmigrated clusters, but I'm
> > not a
> > > > fan
> > > > > > of
> > > > > > > > using the absence of a value to signal something.
> > > > > > > >
> > > > > > > > -----
> > > > > > > >
> > > > > > > > Akhilesh, thanks for reviewing the KIP!
> > > > > > > >
> > > > > > > > 1. MigrationState and MetadataType are mostly the same on the
> > > > > > controller,
> > > > > > > > but we do have the "MigratingZkData" state that seems useful to
> > > > report
> > > > > > > as a
> > > > > > > > metric. Aside from looking at logs, observing the controller in
> > > > this
> > > > > > > state
> > > > > > > > is the only way to see how long its taking to copy data from
> > ZK.
> > > > > > > >
> > > > > > > > "KRaftMode" instead of "MigrationFinalized" is similar to Jun's
> > > > > > question
> > > > > > > > about non-migrated clusters. I think it's useful to have a
> > distinct
> > > > > > > > MigrationState for clusters that have been migrated and those
> > that
> > > > were
> > > > > > > > never migrated. This does mean we'll report the MigrationState
> > long
> > > > > > after
> > > > > > > > the migration is complete, but we can drop these metrics in 4.0
> > > > once ZK
> > > > > > > is
> > > > > > > > removed.
> > > > > > > >
> > > > > > > > 2. The "ZkMigrationReady" will indicate that the controller has
> > > > > > > > "kafka.metadata.migration.enable" _and_ the ZK configs set. We
> > need
> > > > > > some
> > > > > > > > way to indicate that the whole quorum is correctly configured
> > to
> > > > handle
> > > > > > > the
> > > > > > > > migration so we don't failover to a controller that's not
> > > > configured
> > > > > > for
> > > > > > > > ZK. Did I understand your question correctly?
> > > > > > > >
> > > > > > > > 3. Yea, good idea. While the KRaft controller has
> > > > > > > > MigrationState=MigrationIneligible, we could also report
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > "kafka.controller:type=KafkaController,name=MigrationInelgibleBrokerCount".
> > > > > > > > It might be useful to report ineligible controllers as well
> > since
> > > > that
> > > > > > > can
> > > > > > > > prevent the migration from starting.
> > > > > > > >
> > > > > > > > 4. I think I covered this in "Incompatible Brokers". We
> > effectively
> > > > > > fence
> > > > > > > > these brokers by not sending them metadata RPCs.
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > > David
> > > > > > > >
> > > > > > > >
> > > > > > > > On Tue, Nov 1, 2022 at 3:58 PM Akhilesh Chaganti <
> > > > > > akhilesh....@gmail.com
> > > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi David,
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Thanks for the KIP. I have some questions/suggestions.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 1) I see two new metrics:
> > > > > > > > > kafka.controller:type=KafkaController,name=MetadataType and
> > > > > > > > > kafka.controller:type=KafkaController,name=MigrationState.
> > Won't
> > > > the
> > > > > > > > second
> > > > > > > > > metric already cover the cases of the first metric? Also,
> > > > instead of
> > > > > > > > > MigrationFinalized, we could directly say the state is
> > > > KRaftMode. So
> > > > > > we
> > > > > > > > can
> > > > > > > > > use the same value for default KRaft clusters.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 2) ZkMigrationReady in ApiVersionsResponse from KRaft
> > > > Controller. By
> > > > > > > > > default, we plan to start the Controller quorum in "
> > > > > > > > > *kafka.metadata.migration.enable*" config set to true. Then
> > do we
> > > > > > need
> > > > > > > > this
> > > > > > > > > additional information again to make sure The controllers are
> > > > ready
> > > > > > for
> > > > > > > > > migration? What would happen if the Controller assumes it is
> > > > ready
> > > > > > for
> > > > > > > > > migration from 3.4 by default if it doesn't see both
> > > > > > MigrationMetadata
> > > > > > > > > records?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 3) I see that we do not impose order on rolling the brokers
> > with
> > > > > > > > migration
> > > > > > > > > flags and provisioning the controller quorum. Along with the
> > > > KRaft
> > > > > > > > > controller emitting migration state metrics, it may be
> > better to
> > > > emit
> > > > > > > the
> > > > > > > > > broker count for the brokers not ready for migration yet.
> > This
> > > > will
> > > > > > > give
> > > > > > > > us
> > > > > > > > > more insight into any roll issues.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 4) Once the KRaft controller is in migration mode, we should
> > also
> > > > > > > > > prevent/handle ZkBrokerRegistrations that don't enable
> > migration
> > > > > > mode.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > > Akhilesh
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Tue, Nov 1, 2022 at 10:49 AM Jun Rao
> > <j...@confluent.io.invalid
> > > > >
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi, David,
> > > > > > > > > >
> > > > > > > > > > Thanks for the reply.
> > > > > > > > > >
> > > > > > > > > > 20/21. Regarding the new ZkMigrationReady field in
> > > > > > > ApiVersionsResponse,
> > > > > > > > > it
> > > > > > > > > > seems that this is a bit intrusive since it exposes
> > unneeded
> > > > info
> > > > > > to
> > > > > > > > the
> > > > > > > > > > clients. Another option is to add that field as part of the
> > > > Fetch
> > > > > > > > > request.
> > > > > > > > > > We can choose to only set that field in the very first
> > Fetch
> > > > > > request
> > > > > > > > > from a
> > > > > > > > > > Quorum follower.
> > > > > > > > > >
> > > > > > > > > > 40. For
> > > > kafka.controller:type=KafkaController,name=MigrationState,
> > > > > > > what
> > > > > > > > > is
> > > > > > > > > > the value for a brand new KRaft cluster?
> > > > > > > > > >
> > > > > > > > > > Jun
> > > > > > > > > >
> > > > > > > > > > On Mon, Oct 31, 2022 at 2:35 PM David Arthur
> > > > > > > > > > <david.art...@confluent.io.invalid> wrote:
> > > > > > > > > >
> > > > > > > > > > > 30. I think we can keep the single ControllerId field in
> > > > those
> > > > > > > > requests
> > > > > > > > > > > since they are only used for fencing (as far as I know).
> > > > > > > Internally,
> > > > > > > > > the
> > > > > > > > > > > broker components that handle those requests will
> > compare the
> > > > > > > > > > ControllerId
> > > > > > > > > > > with that of MetadataCache (which is updated via UMR).
> > > > > > > > > > >
> > > > > > > > > > > The reason we need the separate KRaftControllerId in the
> > > > > > > > UpdateMetadata
> > > > > > > > > > > code path so that we can have different connection
> > behavior
> > > > for a
> > > > > > > > KRaft
> > > > > > > > > > > controller vs ZK controller.
> > > > > > > > > > >
> > > > > > > > > > > 31. It seems reasonable to keep the MigrationRecord in
> > the
> > > > > > > snapshot.
> > > > > > > > I
> > > > > > > > > > was
> > > > > > > > > > > thinking the same thing in terms of understanding the
> > loss
> > > > for a
> > > > > > > > > > > migration-after-finalization. However, once a snapshot
> > has
> > > > been
> > > > > > > taken
> > > > > > > > > > that
> > > > > > > > > > > includes the final MigrationRecord, we can't easily see
> > which
> > > > > > > records
> > > > > > > > > > came
> > > > > > > > > > > after it.
> > > > > > > > > > >
> > > > > > > > > > > 32. You're correct, we can just use the modify time from
> > the
> > > > > > Stat.
> > > > > > > > The
> > > > > > > > > > > other two fields are primarily informational and are
> > there
> > > > for
> > > > > > > > > operators
> > > > > > > > > > > who want to inspect the state of the migration. They
> > aren't
> > > > > > > required
> > > > > > > > > for
> > > > > > > > > > > correctness
> > > > > > > > > > >
> > > > > > > > > > > 33. Yes that's right. I detail that in "Controller
> > > > Leadership"
> > > > > > > > section
> > > > > > > > > > >
> > > > > > > > > > > 34. Right, I'll fix that.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > David
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Oct 31, 2022 at 2:55 PM Jun Rao
> > > > <j...@confluent.io.invalid
> > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi, David,
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for the updated KIP. A few more comments.
> > > > > > > > > > > >
> > > > > > > > > > > > 30. LeaderAndIsrRequest/StopReplicaRequest both have a
> > > > > > > controllerId
> > > > > > > > > > > field.
> > > > > > > > > > > > Should we add a KRaftControllerId field like
> > > > UpdateMetadata?
> > > > > > > > > > > >
> > > > > > > > > > > > 31. "If a migration has been finalized, but the KRaft
> > > > quroum
> > > > > > > comes
> > > > > > > > up
> > > > > > > > > > > with
> > > > > > > > > > > > kafka.metadata.migration.enable, we must not re-enter
> > the
> > > > > > > migration
> > > > > > > > > > mode.
> > > > > > > > > > > > In this case, while replaying the log, the controller
> > can
> > > > see
> > > > > > the
> > > > > > > > > > second
> > > > > > > > > > > > MigrationRecord and know that the migration is
> > finalized
> > > > and
> > > > > > > should
> > > > > > > > > not
> > > > > > > > > > > be
> > > > > > > > > > > > resumed. " Hmm, do we want to keep the MigrationRecord
> > in
> > > > the
> > > > > > > > > snapshot
> > > > > > > > > > > and
> > > > > > > > > > > > the metadata log forever after migration is finalized?
> > If
> > > > not,
> > > > > > we
> > > > > > > > > can't
> > > > > > > > > > > > know for sure whether a migration has happened or not.
> > > > Also, it
> > > > > > > > might
> > > > > > > > > > be
> > > > > > > > > > > > useful to support switching back to ZK mode after the
> > > > migration
> > > > > > > is
> > > > > > > > > > > > finalized, with the understanding of potential metadata
> > > > loss.
> > > > > > In
> > > > > > > > that
> > > > > > > > > > > case,
> > > > > > > > > > > > we could just trim all metadata log and recopy the ZK
> > > > metadata
> > > > > > > > back.
> > > > > > > > > > > >
> > > > > > > > > > > > 32. The /migration node in ZK: Do we need
> > > > last_update_time_ms
> > > > > > > since
> > > > > > > > > ZK
> > > > > > > > > > > > Stats already has an MTime? Also, how do we plan to
> > use the
> > > > > > > > > > > > kraft_controller_id and kraft_controller_epoch fields?
> > > > > > > > > > > >
> > > > > > > > > > > > 33. Controller migration: We will force a write to the
> > > > > > > > "/controller"
> > > > > > > > > > and
> > > > > > > > > > > > "/controller_epoch" ZNodes before copying ZK data,
> > right?
> > > > > > > > > > > >
> > > > > > > > > > > > 34. "Operator can remove the persistent "/controller"
> > and
> > > > > > > > > > > > "/controller_epoch" nodes allowing for ZK controller
> > > > election
> > > > > > to
> > > > > > > > take
> > > > > > > > > > > > place". I guess the operator only needs to remove the
> > > > > > /controller
> > > > > > > > > path?
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Jun
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Oct 31, 2022 at 7:17 AM David Arthur
> > > > > > > > > > > > <david.art...@confluent.io.invalid> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Happy Monday, everyone! I've updated the KIP with the
> > > > > > following
> > > > > > > > > > > changes:
> > > > > > > > > > > > >
> > > > > > > > > > > > > * Clarified MetadataType metric usages (broker vs
> > > > controller)
> > > > > > > > > > > > > * Added ZkMigrationReady tagged field to
> > > > ApiVersionsResponse
> > > > > > > (for
> > > > > > > > > use
> > > > > > > > > > > by
> > > > > > > > > > > > > KRaft controller quorum)
> > > > > > > > > > > > > * Added MigrationRecord with two states: Started and
> > > > Finished
> > > > > > > > > > > > > * Documented ZK configs for KRaft controller
> > > > > > > > > > > > > * Simplified state machine description (internally,
> > more
> > > > > > states
> > > > > > > > > will
> > > > > > > > > > > > exist,
> > > > > > > > > > > > > but only the four documented are interesting to
> > > > operators)
> > > > > > > > > > > > > * Clarified some things in Controller Migration
> > section
> > > > > > > > > > > > > * Removed KRaft -> ZK parts of Broker Registration
> > > > > > > > > > > > > * Added Misconfigurations section to Failure Modes
> > > > > > > > > > > > >
> > > > > > > > > > > > > Let me know if I've missed anything from the past two
> > > > weeks
> > > > > > of
> > > > > > > > > > > > discussion.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks again to everyone who has reviewed this KIP so
> > > > far!
> > > > > > > > > > > > >
> > > > > > > > > > > > > -David
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Oct 28, 2022 at 2:26 PM Jun Rao
> > > > > > > <j...@confluent.io.invalid
> > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi, David,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks for the reply.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 20/21. Sounds good.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Could you update the doc with all the changes being
> > > > > > > discussed?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, Oct 28, 2022 at 10:11 AM David Arthur
> > > > > > > > > > > > > > <david.art...@confluent.io.invalid> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Jun,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 20/21. I was also wondering about a "migration"
> > > > record.
> > > > > > In
> > > > > > > > > > addition
> > > > > > > > > > > > to
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > scenario you mentioned, we also need a way to
> > > > prevent the
> > > > > > > > > cluster
> > > > > > > > > > > > from
> > > > > > > > > > > > > > > re-entering the dual write mode after the
> > migration
> > > > has
> > > > > > > been
> > > > > > > > > > > > > finalized. I
> > > > > > > > > > > > > > > could see this happening inadvertently via a
> > change
> > > > in
> > > > > > some
> > > > > > > > > > > > > configuration
> > > > > > > > > > > > > > > management system. How about we add a record that
> > > > marks
> > > > > > the
> > > > > > > > > > > beginning
> > > > > > > > > > > > > and
> > > > > > > > > > > > > > > end of the dual-write mode. The first occurrence
> > of
> > > > the
> > > > > > > > record
> > > > > > > > > > > could
> > > > > > > > > > > > be
> > > > > > > > > > > > > > > included in the metadata transaction when we
> > migrate
> > > > data
> > > > > > > > from
> > > > > > > > > > ZK.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > With this, the active controller would decide
> > > > whether to
> > > > > > > > enter
> > > > > > > > > > dual
> > > > > > > > > > > > > write
> > > > > > > > > > > > > > > mode, finalize the migration based, or fail
> > based on:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > * Metadata log state
> > > > > > > > > > > > > > > * It's own configuration
> > > > > > > ("kafka.metadata.migration.enable",
> > > > > > > > > > > > > > > "zookeeper.connect", etc)
> > > > > > > > > > > > > > > * The other controllers configuration (via
> > > > > > > > ApiVersionsResponse)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > WDYT?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 22. Since we will need the fencing anyways as a
> > > > > > safe-guard,
> > > > > > > > > then
> > > > > > > > > > I
> > > > > > > > > > > > > agree
> > > > > > > > > > > > > > > would could skip the registration of KRaft
> > brokers
> > > > in ZK
> > > > > > to
> > > > > > > > > > simply
> > > > > > > > > > > > > > things a
> > > > > > > > > > > > > > > bit.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > David
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Thu, Oct 27, 2022 at 5:11 PM Jun Rao
> > > > > > > > > <j...@confluent.io.invalid
> > > > > > > > > > >
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi, David,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks for the reply.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 20/21. Relying upon the presence of ZK configs
> > to
> > > > > > > determine
> > > > > > > > > > > whether
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > KRaft controller is in a dual write mode seems
> > a
> > > > bit
> > > > > > > error
> > > > > > > > > > prone.
> > > > > > > > > > > > If
> > > > > > > > > > > > > > > > someone accidentally adds a ZK configuration
> > to a
> > > > brand
> > > > > > > new
> > > > > > > > > > KRaft
> > > > > > > > > > > > > > > cluster,
> > > > > > > > > > > > > > > > ideally it shouldn't cause the controller to
> > get
> > > > into a
> > > > > > > > weird
> > > > > > > > > > > > state.
> > > > > > > > > > > > > > Have
> > > > > > > > > > > > > > > > we considered storing the migration state in a
> > > > metadata
> > > > > > > > > record?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 22. If we have the broker fencing logic, do we
> > > > need to
> > > > > > > > write
> > > > > > > > > > the
> > > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > registration path in ZK for KRaft brokers at
> > all?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Thu, Oct 27, 2022 at 1:02 PM David Arthur
> > > > > > > > > > > > > > > > <david.art...@confluent.io.invalid> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Jun,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 20/21. A KRaft controller will recover the
> > > > migration
> > > > > > > > state
> > > > > > > > > by
> > > > > > > > > > > > > reading
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > "/migration" ZNode. If the migration enable
> > > > config is
> > > > > > > > set,
> > > > > > > > > > and
> > > > > > > > > > > > the
> > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > migration is complete, it will enter the
> > > > dual-write
> > > > > > > mode.
> > > > > > > > > > > Before
> > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > operator can decommission ZK, they will need
> > to
> > > > > > > finalize
> > > > > > > > > the
> > > > > > > > > > > > > > migration
> > > > > > > > > > > > > > > > > which involves removing the migration config
> > and
> > > > the
> > > > > > ZK
> > > > > > > > > > config.
> > > > > > > > > > > > > I'll
> > > > > > > > > > > > > > > > > clarify this in the KIP.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 22. Yea, we could see an incorrect broker ID
> > > > during
> > > > > > > that
> > > > > > > > > > > window.
> > > > > > > > > > > > > If
> > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > ended up with a state where we saw a ZK
> > broker ID
> > > > > > that
> > > > > > > > > > > conflicted
> > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > KRaft broker ID, we would need to fence one
> > of
> > > > them.
> > > > > > I
> > > > > > > > > would
> > > > > > > > > > > > > probably
> > > > > > > > > > > > > > > opt
> > > > > > > > > > > > > > > > > to fence the KRaft broker in that case since
> > > > broker
> > > > > > > > > > > registration
> > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > fencing is more robust in KRaft. Hopefully
> > this
> > > > is a
> > > > > > > rare
> > > > > > > > > > case.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 26. Sounds good.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks!
> > > > > > > > > > > > > > > > > David
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Thu, Oct 27, 2022 at 1:34 PM Jun Rao
> > > > > > > > > > > <j...@confluent.io.invalid
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Hi, David,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks for the reply. A few more comments.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 20/21. Using a tagged field in
> > > > ApiVersionRequest
> > > > > > > could
> > > > > > > > > > work.
> > > > > > > > > > > > > > Related
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > this, how does a KRaft controller know that
> > > > it's in
> > > > > > > the
> > > > > > > > > > dual
> > > > > > > > > > > > > write
> > > > > > > > > > > > > > > > mode?
> > > > > > > > > > > > > > > > > > Does it need to read the /controller path
> > from
> > > > ZK?
> > > > > > > > After
> > > > > > > > > > the
> > > > > > > > > > > > > > > migration,
> > > > > > > > > > > > > > > > > > people may have the ZK cluster
> > decommissioned,
> > > > but
> > > > > > > > still
> > > > > > > > > > have
> > > > > > > > > > > > the
> > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > configs left in the KRaft controller. Will
> > this
> > > > > > cause
> > > > > > > > the
> > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > to be stuck because it doesn't know which
> > mode
> > > > it
> > > > > > is
> > > > > > > > in?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 22. Using the ephemeral node matches the
> > > > current
> > > > > > > > ZK-based
> > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > behavior
> > > > > > > > > > > > > > > > > > better. However, it leaves a window for
> > > > incorrect
> > > > > > > > broker
> > > > > > > > > > > > > > registration
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > sneak in during KRaft controller failover.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 26. Then, we could just remove Broker
> > > > Registration
> > > > > > in
> > > > > > > > > that
> > > > > > > > > > > > > section.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Wed, Oct 26, 2022 at 2:21 PM David
> > Arthur
> > > > > > > > > > > > > > > > > > <david.art...@confluent.io.invalid> wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 20/21 It could definitely cause problems
> > if
> > > > we
> > > > > > > > failover
> > > > > > > > > > to
> > > > > > > > > > > a
> > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > without
> > "kafka.metadata.migration.enable".
> > > > The
> > > > > > only
> > > > > > > > > > > > mechanism I
> > > > > > > > > > > > > > > know
> > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > controllers to learn things about one
> > > > another is
> > > > > > > > > > > ApiVersions.
> > > > > > > > > > > > > We
> > > > > > > > > > > > > > > > > > currently
> > > > > > > > > > > > > > > > > > > use this for checking support for
> > > > > > > "metadata.version"
> > > > > > > > > (in
> > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > mode).
> > > > > > > > > > > > > > > > > We
> > > > > > > > > > > > > > > > > > > could add a "zk.migration" feature flag
> > > > that's
> > > > > > > > enabled
> > > > > > > > > > on a
> > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > only
> > > > > > > > > > > > > > > > > > > if the config is set. Another possibility
> > > > would
> > > > > > be
> > > > > > > a
> > > > > > > > > > tagged
> > > > > > > > > > > > > field
> > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > > ApiVersionResponse that indicated if the
> > > > config
> > > > > > was
> > > > > > > > > set.
> > > > > > > > > > > Both
> > > > > > > > > > > > > > seem
> > > > > > > > > > > > > > > > > > somewhat
> > > > > > > > > > > > > > > > > > > inelegant. I think a tagged field would
> > be a
> > > > bit
> > > > > > > > > simpler
> > > > > > > > > > > (and
> > > > > > > > > > > > > > > > arguably
> > > > > > > > > > > > > > > > > > less
> > > > > > > > > > > > > > > > > > > hacky).
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > For 20, we could avoid entering the
> > migration
> > > > > > state
> > > > > > > > > > unless
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > whole
> > > > > > > > > > > > > > > > > > quorum
> > > > > > > > > > > > > > > > > > > had the field present in their
> > > > NodeApiVersions.
> > > > > > For
> > > > > > > > 21,
> > > > > > > > > > we
> > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > avoid
> > > > > > > > > > > > > > > > > > > leaving the migration state unless the
> > whole
> > > > > > quorum
> > > > > > > > did
> > > > > > > > > > not
> > > > > > > > > > > > > have
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > field
> > > > > > > > > > > > > > > > > > > in their NodeApiVersions. Do you think
> > this
> > > > would
> > > > > > > be
> > > > > > > > > > > > > sufficient?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 22. Right, we need to write the broker
> > info
> > > > back
> > > > > > to
> > > > > > > > ZK
> > > > > > > > > > just
> > > > > > > > > > > > as
> > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > safeguard
> > > > > > > > > > > > > > > > > > > against incorrect broker IDs getting
> > > > registered
> > > > > > > into
> > > > > > > > > ZK.
> > > > > > > > > > I
> > > > > > > > > > > > was
> > > > > > > > > > > > > > > > thinking
> > > > > > > > > > > > > > > > > > > these would be persistent nodes, but it's
> > > > > > probably
> > > > > > > > fine
> > > > > > > > > > to
> > > > > > > > > > > > make
> > > > > > > > > > > > > > > them
> > > > > > > > > > > > > > > > > > > ephemeral and have the active KRaft
> > > > controller
> > > > > > keep
> > > > > > > > > them
> > > > > > > > > > up
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > date.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 23. Right. When the broker comes up as a
> > > > KRaft
> > > > > > > > broker,
> > > > > > > > > > it's
> > > > > > > > > > > > old
> > > > > > > > > > > > > > > > > > > /brokers/ids ZNode will be gone and it
> > will
> > > > > > > register
> > > > > > > > > > itself
> > > > > > > > > > > > > with
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > controller. The controller will know
> > that it
> > > > is
> > > > > > now
> > > > > > > > in
> > > > > > > > > > > KRaft
> > > > > > > > > > > > > mode
> > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > > > > stop sending it the old RPCs.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 24. Ok, I'll add these
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 25. I realize now the "/controller_epoch"
> > > > node is
> > > > > > > > > already
> > > > > > > > > > > > > > > persistent.
> > > > > > > > > > > > > > > > > It
> > > > > > > > > > > > > > > > > > > should be sufficient to remove the
> > > > "/controller"
> > > > > > > node
> > > > > > > > > to
> > > > > > > > > > > > > trigger
> > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > > election.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 26. Hmm, not sure, but I don't think it
> > uses
> > > > > > > watches.
> > > > > > > > > > > > > KafkaServer
> > > > > > > > > > > > > > > > > > registers
> > > > > > > > > > > > > > > > > > > the broker info and later loads all
> > brokers
> > > > in
> > > > > > the
> > > > > > > > > > cluster
> > > > > > > > > > > to
> > > > > > > > > > > > > > check
> > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > the endpoints don't conflict. Things
> > that do
> > > > use
> > > > > > > > > watches
> > > > > > > > > > > are
> > > > > > > > > > > > > > > dynamic
> > > > > > > > > > > > > > > > > > > configs, ACLs, and some others (listed
> > in the
> > > > > > KIP).
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Wed, Oct 26, 2022 at 4:26 PM David
> > Arthur
> > > > <
> > > > > > > > > > > > > > > > > david.art...@confluent.io>
> > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Luke and Andrew, thanks for taking a
> > look!
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > I think the names of the state machine
> > in
> > > > the
> > > > > > KIP
> > > > > > > > > > aren't
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > best.
> > > > > > > > > > > > > > > > > I'll
> > > > > > > > > > > > > > > > > > > > try to improve that section in general.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 1. "MigrationReady" is probably better
> > > > named
> > > > > > > > > > > > > "MigratingFromZk"
> > > > > > > > > > > > > > or
> > > > > > > > > > > > > > > > > > > > something. It's meant to be the state
> > when
> > > > the
> > > > > > > > KRaft
> > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > actively
> > > > > > > > > > > > > > > > > > > > migrating the data out of ZK. This
> > happens
> > > > > > after
> > > > > > > we
> > > > > > > > > > have
> > > > > > > > > > > > > > detected
> > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > cluster is eligible and before we
> > enter the
> > > > > > dual
> > > > > > > > > write
> > > > > > > > > > > > mode.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 2. "MigrationActive" should probably be
> > > > called
> > > > > > > > > > > > > > "BrokerMigration".
> > > > > > > > > > > > > > > > > > > > Transitioning to "MigrationFinished"
> > can
> > > > happen
> > > > > > > > > > > > automatically
> > > > > > > > > > > > > > > when
> > > > > > > > > > > > > > > > > all
> > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > known ZK brokers have been migrated.
> > Since
> > > > we
> > > > > > > don't
> > > > > > > > > > have
> > > > > > > > > > > > > > > permanent
> > > > > > > > > > > > > > > > > > > > registrations of ZK brokers, we can
> > use the
> > > > > > > > partition
> > > > > > > > > > > > > > assignments
> > > > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > > proxy for this.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 3. A metric for unready brokers makes
> > > > sense. We
> > > > > > > can
> > > > > > > > > > also
> > > > > > > > > > > > log
> > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > message
> > > > > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > > > the controller when it tries to start
> > the
> > > > > > > migration
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 4. The MetadataType metric is meant to
> > > > help see
> > > > > > > > this.
> > > > > > > > > > > E.g.,
> > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > > brokers
> > > > > > > > > > > > > > > > > > > > would have MetadataType=ZK, some would
> > have
> > > > > > > > > > > > MetadataType=Dual
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 5. On ZK brokers, not having this
> > config
> > > > set
> > > > > > > would
> > > > > > > > > > > prevent
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > new
> > > > > > > > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > > > > registration data from being written
> > to ZK.
> > > > > > This
> > > > > > > > > means
> > > > > > > > > > > the
> > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > > won't send RPCs to it. If a broker has
> > been
> > > > > > > > migrated
> > > > > > > > > to
> > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > already,
> > > > > > > > > > > > > > > > > > > I'm
> > > > > > > > > > > > > > > > > > > > not sure there is any harm in removing
> > this
> > > > > > > config.
> > > > > > > > > If
> > > > > > > > > > we
> > > > > > > > > > > > > > decide
> > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > > need to guarantee that KRaft brokers
> > have
> > > > this
> > > > > > > > config
> > > > > > > > > > set
> > > > > > > > > > > > > > during
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > migration, we can include it in the
> > broker
> > > > > > > > > registration
> > > > > > > > > > > > > that's
> > > > > > > > > > > > > > > sent
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > KRaft. This would let the controller
> > keep
> > > > that
> > > > > > > > broker
> > > > > > > > > > > > fenced.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 6. Once the KRaft controller takes
> > > > leadership,
> > > > > > > the
> > > > > > > > ZK
> > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > won't
> > > > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > active any more and will stop
> > reporting the
> > > > > > > > > > > MigrationState
> > > > > > > > > > > > > > > metric.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 7. The "Dual" MetadataType is reported
> > by
> > > > > > brokers
> > > > > > > > > > running
> > > > > > > > > > > > in
> > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > mode
> > > > > > > > > > > > > > > > > > > > when the migration enable config is
> > set. I
> > > > > > think
> > > > > > > > the
> > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > should
> > > > > > > > > > > > > > > > > > > also
> > > > > > > > > > > > > > > > > > > > report this, not just the brokers. I'll
> > > > clarify
> > > > > > > in
> > > > > > > > > the
> > > > > > > > > > > KIP.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Andrew:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > How will the code get all the
> > ZooKeeper
> > > > > > config?
> > > > > > > > > Will
> > > > > > > > > > it
> > > > > > > > > > > > do
> > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > sort
> > > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > scan of the ZooKeeper data store? ...
> > Do we
> > > > > > need
> > > > > > > to
> > > > > > > > > do
> > > > > > > > > > > > > anything
> > > > > > > > > > > > > > > > > special
> > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > make sure we get all data from
> > ZooKeeper?
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > With a few exceptions, all data
> > written to
> > > > ZK
> > > > > > by
> > > > > > > > > Kafka
> > > > > > > > > > > > > happens
> > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > controller (single writer principle).
> > In
> > > > our
> > > > > > > > > migration
> > > > > > > > > > > > code,
> > > > > > > > > > > > > we
> > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > > > > unconditionally update the
> > "/controller"
> > > > and
> > > > > > > > > > > > > > "/controller_epoch"
> > > > > > > > > > > > > > > > > ZNodes
> > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > effectively force the migration
> > component
> > > > to
> > > > > > gain
> > > > > > > > > > > > leadership
> > > > > > > > > > > > > of
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > > controller. Once this happens, we don't
> > > > expect
> > > > > > > any
> > > > > > > > > data
> > > > > > > > > > > to
> > > > > > > > > > > > be
> > > > > > > > > > > > > > > > written
> > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > ZK, so we can read it iteratively
> > without
> > > > > > > worrying
> > > > > > > > > > about
> > > > > > > > > > > > any
> > > > > > > > > > > > > > > > > concurrent
> > > > > > > > > > > > > > > > > > > > writes.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > As for the linearizable thing, I
> > believe
> > > > this
> > > > > > > just
> > > > > > > > > > means
> > > > > > > > > > > > that
> > > > > > > > > > > > > > > reads
> > > > > > > > > > > > > > > > > may
> > > > > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > served by a quorum follower which has
> > stale
> > > > > > data.
> > > > > > > > > We'll
> > > > > > > > > > > be
> > > > > > > > > > > > > > > reading
> > > > > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > > the same way the ZK controller does,
> > so I
> > > > think
> > > > > > > we
> > > > > > > > > will
> > > > > > > > > > > be
> > > > > > > > > > > > > fine
> > > > > > > > > > > > > > > > > > regarding
> > > > > > > > > > > > > > > > > > > > consistency. If we wanted to be extra
> > > > careful,
> > > > > > we
> > > > > > > > > could
> > > > > > > > > > > > add a
> > > > > > > > > > > > > > > delay
> > > > > > > > > > > > > > > > > > prior
> > > > > > > > > > > > > > > > > > > > to iterating through the znodes to give
> > > > > > > partitioned
> > > > > > > > > ZK
> > > > > > > > > > > > > > followers
> > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > chance
> > > > > > > > > > > > > > > > > > > > to get kicked out of the quorum. I
> > don't
> > > > think
> > > > > > > > we'll
> > > > > > > > > > need
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > though
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > What happens if we commit the
> > transaction
> > > > > > then
> > > > > > > > fail
> > > > > > > > > > > right
> > > > > > > > > > > > > > after
> > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > restart?
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > If we commit the "migration"
> > transaction
> > > > to the
> > > > > > > > > > metadata
> > > > > > > > > > > > log,
> > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > won't
> > > > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > a problem if we failover or restart.
> > We can
> > > > > > > recover
> > > > > > > > > our
> > > > > > > > > > > > state
> > > > > > > > > > > > > > > based
> > > > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > metadata log and what exists in ZK. If
> > we
> > > > see a
> > > > > > > > > > migration
> > > > > > > > > > > > > > > > transaction
> > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > the kraft log, but no "/migration"
> > ZNode,
> > > > we'll
> > > > > > > > know
> > > > > > > > > we
> > > > > > > > > > > > > failed
> > > > > > > > > > > > > > > > before
> > > > > > > > > > > > > > > > > > > > writing to ZK. If we see a partial
> > > > transaction
> > > > > > in
> > > > > > > > the
> > > > > > > > > > > log,
> > > > > > > > > > > > > then
> > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > > > > abort it and restart the migration.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > This sort of leads to me wondering
> > if we
> > > > > > > > > will/should
> > > > > > > > > > > > check
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > metadata
> > > > > > > > > > > > > > > > > > > > log state before doing the migration?
> > That
> > > > > > could
> > > > > > > > also
> > > > > > > > > > > catch
> > > > > > > > > > > > > > > > mistakes
> > > > > > > > > > > > > > > > > > such
> > > > > > > > > > > > > > > > > > > > as if a KRaft quorum is used that
> > already
> > > > had
> > > > > > > some
> > > > > > > > > > > metadata
> > > > > > > > > > > > > > > which I
> > > > > > > > > > > > > > > > > > > assume
> > > > > > > > > > > > > > > > > > > > we want to prevent.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Yes, we should check that the metadata
> > log
> > > > is
> > > > > > > empty
> > > > > > > > > > > before
> > > > > > > > > > > > > > > > > attempting a
> > > > > > > > > > > > > > > > > > > > migration (perhaps excepting the
> > bootstrap
> > > > > > > metadata
> > > > > > > > > --
> > > > > > > > > > > need
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > > > that one still).
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Mon, Oct 24, 2022 at 8:57 AM Andrew
> > > > Grant
> > > > > > > > > > > > > > > > > > <agr...@confluent.io.invalid
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >> Hey David,
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >> Thanks for the KIP. I had a few small
> > > > > > questions.
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >> “The ZK data migration will copy the
> > > > existing
> > > > > > ZK
> > > > > > > > > data
> > > > > > > > > > > into
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > >> metadata log and establish the new
> > KRaft
> > > > > > active
> > > > > > > > > > > controller
> > > > > > > > > > > > > as
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > active
> > > > > > > > > > > > > > > > > > > >> controller from a ZK perspective.”
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >> How will the code get all the
> > ZooKeeper
> > > > > > config?
> > > > > > > > Will
> > > > > > > > > > it
> > > > > > > > > > > do
> > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > sort
> > > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > >> scan of the ZooKeeper data store?
> > Also I’m
> > > > > > not a
> > > > > > > > > > > ZooKeeper
> > > > > > > > > > > > > > > expert
> > > > > > > > > > > > > > > > > but
> > > > > > > > > > > > > > > > > > > per
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >
> > > > > > > > >
> > https://zookeeper.apache.org/doc/current/zookeeperInternals.html
> > > > > > > > > > > > > > > > > > “Read
> > > > > > > > > > > > > > > > > > > >> operations in ZooKeeper are *not
> > > > linearizable*
> > > > > > > > since
> > > > > > > > > > > they
> > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > return
> > > > > > > > > > > > > > > > > > > >> potentially stale data.” Do we need
> > to do
> > > > > > > anything
> > > > > > > > > > > special
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > make
> > > > > > > > > > > > > > > > > > sure
> > > > > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > >> get all data from ZooKeeper?
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >> “For the initial migration, the
> > controller
> > > > > > will
> > > > > > > > > > utilize
> > > > > > > > > > > > > > KIP-868
> > > > > > > > > > > > > > > > > > Metadata
> > > > > > > > > > > > > > > > > > > >> Transactions
> > > > > > > > > > > > > > > > > > > >> <
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-868+Metadata+Transactions
> > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > >> to
> > > > > > > > > > > > > > > > > > > >> write all of the ZK metadata in a
> > single
> > > > > > > > > transaction.
> > > > > > > > > > If
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > >> fails before this transaction is
> > > > finalized,
> > > > > > the
> > > > > > > > next
> > > > > > > > > > > > active
> > > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > >> will
> > > > > > > > > > > > > > > > > > > >> abort the transaction and restart the
> > > > > > migration
> > > > > > > > > > > process.”
> > > > > > > > > > > > > What
> > > > > > > > > > > > > > > > > happens
> > > > > > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > > > > >> we commit the transaction then fail
> > right
> > > > > > after
> > > > > > > > that
> > > > > > > > > > and
> > > > > > > > > > > > > > > restart?
> > > > > > > > > > > > > > > > > This
> > > > > > > > > > > > > > > > > > > >> sort
> > > > > > > > > > > > > > > > > > > >> of leads to me wondering if we
> > will/should
> > > > > > check
> > > > > > > > the
> > > > > > > > > > > > > metadata
> > > > > > > > > > > > > > > log
> > > > > > > > > > > > > > > > > > state
> > > > > > > > > > > > > > > > > > > >> before doing the migration? That could
> > > > also
> > > > > > > catch
> > > > > > > > > > > mistakes
> > > > > > > > > > > > > > such
> > > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > >> KRaft quorum is used that already had
> > some
> > > > > > > > metadata
> > > > > > > > > > > which
> > > > > > > > > > > > I
> > > > > > > > > > > > > > > assume
> > > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > want
> > > > > > > > > > > > > > > > > > > >> to prevent.
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >> For the Test Plan section do you think
> > > > it’s
> > > > > > > worth
> > > > > > > > > > > calling
> > > > > > > > > > > > > out
> > > > > > > > > > > > > > > > > > > performance
> > > > > > > > > > > > > > > > > > > >> testing migrations of ZooKeeper
> > > > deployments
> > > > > > that
> > > > > > > > > have
> > > > > > > > > > > > > “large”
> > > > > > > > > > > > > > > > > amounts
> > > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > >> metadata?
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >> Thanks,
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >> Andrew
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >> On Mon, Oct 24, 2022 at 3:20 AM Luke
> > Chen
> > > > <
> > > > > > > > > > > > > show...@gmail.com>
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >> > Hi David,
> > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > >> > Thanks for the good and complicated
> > > > > > proposal!
> > > > > > > :)
> > > > > > > > > > > > > > > > > > > >> > Some questions:
> > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > >> > 1. "MigrationReady" state: The KIP
> > said:
> > > > > > > > > > > > > > > > > > > >> > The KRaft quorum has been started
> > > > > > > > > > > > > > > > > > > >> > I don't think we'll automatically
> > enter
> > > > this
> > > > > > > > state
> > > > > > > > > > > after
> > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > quorum
> > > > > > > > > > > > > > > > > > > >> > started, do we?
> > > > > > > > > > > > > > > > > > > >> > I think KRaft active controller
> > should
> > > > take
> > > > > > > over
> > > > > > > > > > > > > leadership
> > > > > > > > > > > > > > by
> > > > > > > > > > > > > > > > > > > writing a
> > > > > > > > > > > > > > > > > > > >> > znode in /controller path before we
> > > > entering
> > > > > > > > this
> > > > > > > > > > > sate,
> > > > > > > > > > > > is
> > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > correct?
> > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > >> > 2. "MigrationActive" state: the KIP
> > > > said:
> > > > > > > > > > > > > > > > > > > >> > ZK state has been migrated,
> > controller
> > > > is in
> > > > > > > > > > > dual-write
> > > > > > > > > > > > > > mode,
> > > > > > > > > > > > > > > > > > brokers
> > > > > > > > > > > > > > > > > > > >> are
> > > > > > > > > > > > > > > > > > > >> > being restarted in KRaft mode
> > > > > > > > > > > > > > > > > > > >> > What confuses me is the last part:
> > > > "brokers
> > > > > > > are
> > > > > > > > > > being
> > > > > > > > > > > > > > > restarted
> > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > >> > mode".
> > > > > > > > > > > > > > > > > > > >> > How could we detect brokers are
> > being
> > > > > > > restarted
> > > > > > > > in
> > > > > > > > > > > KRaft
> > > > > > > > > > > > > > mode?
> > > > > > > > > > > > > > > > Old
> > > > > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > >> > broker is removed and new KRaft
> > broker
> > > > is up
> > > > > > > > > within
> > > > > > > > > > N
> > > > > > > > > > > > > > minutes?
> > > > > > > > > > > > > > > > > > > >> > I think we don't have to rely on the
> > > > > > condition
> > > > > > > > > > > "brokers
> > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > being
> > > > > > > > > > > > > > > > > > > >> restarted
> > > > > > > > > > > > > > > > > > > >> > in KRaft mode" to enter this state.
> > > > > > > > > > > > > > > > > > > >> > "brokers are being restarted in
> > KRaft
> > > > mode"
> > > > > > > > should
> > > > > > > > > > be
> > > > > > > > > > > a
> > > > > > > > > > > > > > > process
> > > > > > > > > > > > > > > > > > > happened
> > > > > > > > > > > > > > > > > > > >> > between "MigrationActive" and
> > > > > > > > "MigrationFinished".
> > > > > > > > > > > Does
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > make
> > > > > > > > > > > > > > > > > > > sense?
> > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > >> > 3. When "Zookeeper" mode trying to
> > enter
> > > > > > > > > > > > > > "MigrationEligible",
> > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > > > there
> > > > > > > > > > > > > > > > > > > >> is a
> > > > > > > > > > > > > > > > > > > >> > cluster with tens of brokers, how
> > could
> > > > > > users
> > > > > > > > know
> > > > > > > > > > why
> > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > cluster
> > > > > > > > > > > > > > > > > > > >> cannot
> > > > > > > > > > > > > > > > > > > >> > be in "MigrationEligible" state,
> > yet?
> > > > Check
> > > > > > > that
> > > > > > > > > > znode
> > > > > > > > > > > > > > > manually
> > > > > > > > > > > > > > > > > one
> > > > > > > > > > > > > > > > > > by
> > > > > > > > > > > > > > > > > > > >> one?
> > > > > > > > > > > > > > > > > > > >> > Or do we plan to create a tool to
> > help
> > > > them?
> > > > > > > Or
> > > > > > > > > > maybe
> > > > > > > > > > > > > expose
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > >> "unready
> > > > > > > > > > > > > > > > > > > >> > ZK brokers" metrics?
> > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > >> > 4. Same for "MigrationActive"
> > entering
> > > > > > > > > > > > "MigrationFinished"
> > > > > > > > > > > > > > > > state.
> > > > > > > > > > > > > > > > > > > Since
> > > > > > > > > > > > > > > > > > > >> > that will be some process for
> > > > restarting ZK
> > > > > > > > broker
> > > > > > > > > > > into
> > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > > > >> one by
> > > > > > > > > > > > > > > > > > > >> > one, could we expose the "remaining
> > ZK
> > > > > > > brokers"
> > > > > > > > as
> > > > > > > > > > > > > metrics?
> > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > >> > 5. When users are in
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > "MigrationReady"/"MigrationActive"/"MigrationFinished"
> > > > > > > > > > > > > > > > > > > >> > states, and they accidentally
> > change the
> > > > > > KRaft
> > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > config:
> > > > > > > > > > > > > > > > > > > >> > "kafka.metadata.migration.enable"
> > > > > > > > > > > > > > > > > > > >> > to false. What will happen to this
> > > > cluster?
> > > > > > No
> > > > > > > > > > > > dual-write
> > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > it?
> > > > > > > > > > > > > > > > > > Will
> > > > > > > > > > > > > > > > > > > >> we
> > > > > > > > > > > > > > > > > > > >> > have any protection for it?
> > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > >> > 6. About the "MigrationState"
> > metric:
> > > > > > > > > > > > > > > > > > > >> > The "ZooKeeper" and
> > "MigrationEligible"
> > > > is
> > > > > > > > > reported
> > > > > > > > > > by
> > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > controller,
> > > > > > > > > > > > > > > > > > > >> and
> > > > > > > > > > > > > > > > > > > >> > the rest of states are reported by
> > KRaft
> > > > > > > > > controller
> > > > > > > > > > > -->
> > > > > > > > > > > > > > makes
> > > > > > > > > > > > > > > > > sense
> > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > >> me.
> > > > > > > > > > > > > > > > > > > >> > One question from it is, when KRaft
> > > > > > controller
> > > > > > > > > takes
> > > > > > > > > > > > over
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > leadership
> > > > > > > > > > > > > > > > > > > >> > from ZK controller, what will the
> > > > > > > > "MigrationState"
> > > > > > > > > > > value
> > > > > > > > > > > > > in
> > > > > > > > > > > > > > > old
> > > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > >> > controller? keep in
> > "MigrationEligible"
> > > > > > > doesn't
> > > > > > > > > make
> > > > > > > > > > > > > sense.
> > > > > > > > > > > > > > > Will
> > > > > > > > > > > > > > > > > > there
> > > > > > > > > > > > > > > > > > > >> be a
> > > > > > > > > > > > > > > > > > > >> > empty or null state?
> > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > >> > 7. About the "MetadataType" metric:
> > > > > > > > > > > > > > > > > > > >> > An enumeration of: ZooKeeper (1),
> > Dual
> > > > (2),
> > > > > > > > KRaft
> > > > > > > > > > (3).
> > > > > > > > > > > > > Each
> > > > > > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > > > >> reports
> > > > > > > > > > > > > > > > > > > >> > this.
> > > > > > > > > > > > > > > > > > > >> > I don't know how we could map the
> > > > migration
> > > > > > > > state
> > > > > > > > > to
> > > > > > > > > > > > > these 3
> > > > > > > > > > > > > > > > > types.
> > > > > > > > > > > > > > > > > > > >> > What is the metadataType when
> > cluster in
> > > > > > > > > > > > "MigrationReady"
> > > > > > > > > > > > > > > state?
> > > > > > > > > > > > > > > > > > Still
> > > > > > > > > > > > > > > > > > > >> > Zookeeper?
> > > > > > > > > > > > > > > > > > > >> > When will brokers enter Dual type?
> > > > > > > > > > > > > > > > > > > >> > This is unclear in the KIP.
> > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > >> > Thank you.
> > > > > > > > > > > > > > > > > > > >> > Luke
> > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > >> > On Thu, Oct 20, 2022 at 11:33 PM
> > David
> > > > > > Arthur
> > > > > > > > > > > > > > > > > > > >> > <david.art...@confluent.io.invalid>
> > > > wrote:
> > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > >> > > Igor, thanks for taking a look!
> > Since
> > > > JBOD
> > > > > > > in
> > > > > > > > > > KRaft
> > > > > > > > > > > is
> > > > > > > > > > > > > > still
> > > > > > > > > > > > > > > > > under
> > > > > > > > > > > > > > > > > > > >> > > discussion and not likely to land
> > > > before
> > > > > > the
> > > > > > > > ZK
> > > > > > > > > > > > > > migration, I
> > > > > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > > > > > >> we'll
> > > > > > > > > > > > > > > > > > > >> > > need to defer it. For migrating
> > JBOD
> > > > > > > clusters
> > > > > > > > > from
> > > > > > > > > > > ZK
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > > KRaft,
> > > > > > > > > > > > > > > > > > > we'll
> > > > > > > > > > > > > > > > > > > >> > also
> > > > > > > > > > > > > > > > > > > >> > > need to address the log dir
> > failure
> > > > > > > mechanism
> > > > > > > > > > which
> > > > > > > > > > > > > > > currently
> > > > > > > > > > > > > > > > > > uses a
> > > > > > > > > > > > > > > > > > > >> > > special ZNode written to by the
> > > > brokers.
> > > > > > > There
> > > > > > > > > is
> > > > > > > > > > an
> > > > > > > > > > > > old
> > > > > > > > > > > > > > KIP
> > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-589+Add+API+to+update+Replica+state+in+Controller
> > > > > > > > > > > > > > > > > > > >> > > which proposes a new RPC to move
> > that
> > > > ZK
> > > > > > > write
> > > > > > > > > to
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > > controller,
> > > > > > > > > > > > > > > > > > > but
> > > > > > > > > > > > > > > > > > > >> I'm
> > > > > > > > > > > > > > > > > > > >> > > not sure if that's the approach
> > we'll
> > > > want
> > > > > > > to
> > > > > > > > > > take.
> > > > > > > > > > > I
> > > > > > > > > > > > > read
> > > > > > > > > > > > > > > > > through
> > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > >> > > discussion over in KIP-858 and it
> > > > sounds
> > > > > > > like
> > > > > > > > > > there
> > > > > > > > > > > > are
> > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > good
> > > > > > > > > > > > > > > > > > > >> ideas
> > > > > > > > > > > > > > > > > > > >> > > there.
> > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > >> > > To answer your question more
> > directly,
> > > > > > > > migrating
> > > > > > > > > > ZK
> > > > > > > > > > > > > > clusters
> > > > > > > > > > > > > > > > > using
> > > > > > > > > > > > > > > > > > > >> JBOD
> > > > > > > > > > > > > > > > > > > >> > is
> > > > > > > > > > > > > > > > > > > >> > > out of scope for this KIP. It
> > might be
> > > > > > > > possible
> > > > > > > > > to
> > > > > > > > > > > > stop
> > > > > > > > > > > > > > > using
> > > > > > > > > > > > > > > > > JBOD
> > > > > > > > > > > > > > > > > > > >> using
> > > > > > > > > > > > > > > > > > > >> > > reassignments, but I'm not sure
> > about
> > > > > > that.
> > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > >> > > -David
> > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > >> > > On Tue, Oct 18, 2022 at 12:17 PM
> > Igor
> > > > > > > Soarez <
> > > > > > > > > > > > > i...@soarez.me
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > >> > > > Hi David,
> > > > > > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > > > > > >> > > > Thanks for the KIP, this is very
> > > > > > exciting!
> > > > > > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > > > > > >> > > > How does JBOD relate to this
> > work?
> > > > KRaft
> > > > > > > > mode
> > > > > > > > > > > > doesn't
> > > > > > > > > > > > > > yet
> > > > > > > > > > > > > > > > > > support
> > > > > > > > > > > > > > > > > > > >> > > > configuring brokers with
> > multiple
> > > > log
> > > > > > > > > > directories.
> > > > > > > > > > > > If
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > brokers
> > > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > >> > the
> > > > > > > > > > > > > > > > > > > >> > > > existing cluster are configured
> > with
> > > > > > > > multiple
> > > > > > > > > > log
> > > > > > > > > > > > > dirs,
> > > > > > > > > > > > > > > does
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > >> > > migration
> > > > > > > > > > > > > > > > > > > >> > > > imply that the existing brokers
> > > > need to
> > > > > > > drop
> > > > > > > > > use
> > > > > > > > > > > of
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > feature?
> > > > > > > > > > > > > > > > > > > >> Or is
> > > > > > > > > > > > > > > > > > > >> > > > there some way to upgrade them
> > > > later?
> > > > > > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > > > > > >> > > > Thanks,
> > > > > > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > > > > > >> > > > --
> > > > > > > > > > > > > > > > > > > >> > > > Igor
> > > > > > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > > > > > >> > > > On Mon, Oct 17, 2022, at 10:07
> > PM,
> > > > David
> > > > > > > > > Arthur
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >> > > > > I've updated the KIP with the
> > > > > > following
> > > > > > > > > > changes
> > > > > > > > > > > > (the
> > > > > > > > > > > > > > > > > > confluence
> > > > > > > > > > > > > > > > > > > >> diff
> > > > > > > > > > > > > > > > > > > >> > is
> > > > > > > > > > > > > > > > > > > >> > > > not
> > > > > > > > > > > > > > > > > > > >> > > > > helpful here since i
> > rearranged
> > > > some
> > > > > > > > things)
> > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > >> > > > > * Added
> > > > ZooKeeperBlockingKRaftMillis
> > > > > > > > metric
> > > > > > > > > > > > > > > > > > > >> > > > > * Added section on new broker
> > > > > > > registration
> > > > > > > > > > JSON
> > > > > > > > > > > > > > > > > > > >> > > > > * Removed section on
> > > > MigrationCheck
> > > > > > RPC
> > > > > > > > > > > > > > > > > > > >> > > > > * Added change to
> > > > > > UpdateMetadataRequest
> > > > > > > > > > > > > > > > > > > >> > > > > * Added section "Additional ZK
> > > > Broker
> > > > > > > > > Configs"
> > > > > > > > > > > > > > (includes
> > > > > > > > > > > > > > > > > > configs
> > > > > > > > > > > > > > > > > > > >> to
> > > > > > > > > > > > > > > > > > > >> > > > connect
> > > > > > > > > > > > > > > > > > > >> > > > > to KRaft quorum)
> > > > > > > > > > > > > > > > > > > >> > > > > * Added section on
> > "Incompatible
> > > > > > > Brokers"
> > > > > > > > > > under
> > > > > > > > > > > > > > Failure
> > > > > > > > > > > > > > > > > Modes
> > > > > > > > > > > > > > > > > > > >> > > > > * Clarified many things per
> > this
> > > > > > > > discussion
> > > > > > > > > > > thread
> > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > >> > > > > I realized we need the KRaft
> > > > > > controller
> > > > > > > to
> > > > > > > > > > pick
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > correct
> > > > > > > > > > > > > > > > > > > >> > > > > "metadata.version" when
> > > > initializing
> > > > > > the
> > > > > > > > > > > > migration.
> > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > > included
> > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > >> > IBP
> > > > > > > > > > > > > > > > > > > >> > > > of a
> > > > > > > > > > > > > > > > > > > >> > > > > broker in its registration
> > data
> > > > so the
> > > > > > > > KRaft
> > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > > > verify
> > > > > > > > > > > > > > > > > > > >> > the
> > > > > > > > > > > > > > > > > > > >> > > > IBP
> > > > > > > > > > > > > > > > > > > >> > > > > and pick the correct
> > > > > > "metadata.version"
> > > > > > > > when
> > > > > > > > > > > > > starting
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > >> migration.
> > > > > > > > > > > > > > > > > > > >> > > > > Otherwise, we could
> > inadvertently
> > > > > > > > downgrade
> > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > >> IBP/metadata.version
> > > > > > > > > > > > > > > > > > > >> > as
> > > > > > > > > > > > > > > > > > > >> > > > > part of the migration.
> > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > >> > > > > I also added "clusterId" to
> > the
> > > > broker
> > > > > > > > > > > > registration
> > > > > > > > > > > > > > data
> > > > > > > > > > > > > > > > so
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > >> KRaft
> > > > > > > > > > > > > > > > > > > >> > > > could
> > > > > > > > > > > > > > > > > > > >> > > > > verify it.
> > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > >> > > > > -David
> > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > >> > > > > On Mon, Oct 17, 2022 at 12:14
> > PM
> > > > Colin
> > > > > > > > > McCabe
> > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > cmcc...@apache.org
> > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > >> > > > wrote:
> > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > >> > > > >> On Fri, Oct 14, 2022, at
> > 11:39,
> > > > Jun
> > > > > > Rao
> > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >> > > > >> > Hi, Colin,
> > > > > > > > > > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> > 10. "That all goes away in
> > the
> > > > new
> > > > > > > > mode,
> > > > > > > > > > and
> > > > > > > > > > > we
> > > > > > > > > > > > > > just
> > > > > > > > > > > > > > > > have
> > > > > > > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > > > >> > code
> > > > > > > > > > > > > > > > > > > >> > > > which
> > > > > > > > > > > > > > > > > > > >> > > > >> > analyzes
> > __cluster_metadata and
> > > > > > > > reflects
> > > > > > > > > it
> > > > > > > > > > > in
> > > > > > > > > > > > 1)
> > > > > > > > > > > > > > > > updates
> > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > >> > and
> > > > > > > > > > > > > > > > > > > >> > > 2)
> > > > > > > > > > > > > > > > > > > >> > > > >> > messages sent out to
> > brokers."
> > > > > > > > > > > > > > > > > > > >> > > > >> > Hmm, I am not sure it's
> > that
> > > > > > simple.
> > > > > > > > Some
> > > > > > > > > > of
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > complexity
> > > > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > > >> > > > >> ZK-based
> > > > > > > > > > > > > > > > > > > >> > > > >> > controller are (1) using
> > the
> > > > > > > pipelining
> > > > > > > > > > > > approach
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > write
> > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > >> > for
> > > > > > > > > > > > > > > > > > > >> > > > >> better
> > > > > > > > > > > > > > > > > > > >> > > > >> > throughput and using
> > > > conditional
> > > > > > > writes
> > > > > > > > > for
> > > > > > > > > > > > > > > > correctness;
> > > > > > > > > > > > > > > > > > (2)
> > > > > > > > > > > > > > > > > > > >> > sending
> > > > > > > > > > > > > > > > > > > >> > > > the
> > > > > > > > > > > > > > > > > > > >> > > > >> > proper LeaderAndIsr and
> > > > > > > UpdateMetadata
> > > > > > > > > > > > requests.
> > > > > > > > > > > > > > For
> > > > > > > > > > > > > > > > > > example,
> > > > > > > > > > > > > > > > > > > >> > during
> > > > > > > > > > > > > > > > > > > >> > > > >> > controller failover, the
> > full
> > > > > > > metadata
> > > > > > > > > > needs
> > > > > > > > > > > to
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > sent
> > > > > > > > > > > > > > > > > > while
> > > > > > > > > > > > > > > > > > > >> > during
> > > > > > > > > > > > > > > > > > > >> > > > >> > individual broker failure,
> > only
> > > > > > some
> > > > > > > of
> > > > > > > > > the
> > > > > > > > > > > > > > metadata
> > > > > > > > > > > > > > > > > needs
> > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > >> be
> > > > > > > > > > > > > > > > > > > >> > > > updated.
> > > > > > > > > > > > > > > > > > > >> > > > >> > The controlled shutdown
> > > > handling
> > > > > > > > > sometimes
> > > > > > > > > > > uses
> > > > > > > > > > > > > > > > > > > >> StopReplicaRequest
> > > > > > > > > > > > > > > > > > > >> > > > and
> > > > > > > > > > > > > > > > > > > >> > > > >> > some other times uses
> > > > > > > > > LeaderAndIsrRequest.
> > > > > > > > > > > (3)
> > > > > > > > > > > > > > > > triggering
> > > > > > > > > > > > > > > > > > new
> > > > > > > > > > > > > > > > > > > >> > events
> > > > > > > > > > > > > > > > > > > >> > > > >> based
> > > > > > > > > > > > > > > > > > > >> > > > >> > on the responses of
> > > > LeaderAndIsr
> > > > > > > (e.g.
> > > > > > > > > for
> > > > > > > > > > > > topic
> > > > > > > > > > > > > > > > > deletion).
> > > > > > > > > > > > > > > > > > > >> Some
> > > > > > > > > > > > > > > > > > > >> > of
> > > > > > > > > > > > > > > > > > > >> > > > those
> > > > > > > > > > > > > > > > > > > >> > > > >> > complexity could be
> > > > re-implemented
> > > > > > > in a
> > > > > > > > > > more
> > > > > > > > > > > > > > > efficient
> > > > > > > > > > > > > > > > > way,
> > > > > > > > > > > > > > > > > > > >> but we
> > > > > > > > > > > > > > > > > > > >> > > > need
> > > > > > > > > > > > > > > > > > > >> > > > >> to
> > > > > > > > > > > > > > > > > > > >> > > > >> > be really careful not to
> > > > generate
> > > > > > > > > > regression.
> > > > > > > > > > > > > Some
> > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > other
> > > > > > > > > > > > > > > > > > > >> > > > >> complexity
> > > > > > > > > > > > > > > > > > > >> > > > >> > just won't go away.
> > > > Reimplementing
> > > > > > > all
> > > > > > > > > > those
> > > > > > > > > > > > > logic
> > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > 30
> > > > > > > > > > > > > > > > > > > >> or
> > > > > > > > > > > > > > > > > > > >> > so
> > > > > > > > > > > > > > > > > > > >> > > > >> events
> > > > > > > > > > > > > > > > > > > >> > > > >> > in the ZK-based controller
> > is
> > > > > > > possible,
> > > > > > > > > but
> > > > > > > > > > > > > seems a
> > > > > > > > > > > > > > > bit
> > > > > > > > > > > > > > > > > > > >> daunting
> > > > > > > > > > > > > > > > > > > >> > and
> > > > > > > > > > > > > > > > > > > >> > > > >> risky.
> > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > >> > > > >> Hi Jun,
> > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > >> > > > >> Thanks for the response.  I
> > agree
> > > > > > that
> > > > > > > > > there
> > > > > > > > > > is
> > > > > > > > > > > > > some
> > > > > > > > > > > > > > > work
> > > > > > > > > > > > > > > > > > here
> > > > > > > > > > > > > > > > > > > >> but I
> > > > > > > > > > > > > > > > > > > >> > > > don't
> > > > > > > > > > > > > > > > > > > >> > > > >> think it's as bad as it might
> > > > seem.
> > > > > > > Most
> > > > > > > > of
> > > > > > > > > > the
> > > > > > > > > > > > > code
> > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > writing
> > > > > > > > > > > > > > > > > > > >> to
> > > > > > > > > > > > > > > > > > > >> > ZK
> > > > > > > > > > > > > > > > > > > >> > > > can
> > > > > > > > > > > > > > > > > > > >> > > > >> be reused from the old
> > > > controller. We
> > > > > > > > > should
> > > > > > > > > > at
> > > > > > > > > > > > > least
> > > > > > > > > > > > > > > > > > evaluate
> > > > > > > > > > > > > > > > > > > >> using
> > > > > > > > > > > > > > > > > > > >> > > the
> > > > > > > > > > > > > > > > > > > >> > > > >> old ControllerChannelManager
> > as
> > > > > > well. I
> > > > > > > > > don't
> > > > > > > > > > > > know
> > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > we'll
> > > > > > > > > > > > > > > > > > end
> > > > > > > > > > > > > > > > > > > >> up
> > > > > > > > > > > > > > > > > > > >> > > > doing it
> > > > > > > > > > > > > > > > > > > >> > > > >> or not but it's a
> > possibility.
> > > > (The
> > > > > > > main
> > > > > > > > > > reason
> > > > > > > > > > > > to
> > > > > > > > > > > > > > not
> > > > > > > > > > > > > > > do
> > > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > >> that
> > > > > > > > > > > > > > > > > > > >> > > the
> > > > > > > > > > > > > > > > > > > >> > > > >> response handling will be a
> > bit
> > > > > > > > different)
> > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > >> > > > >> The question of what to do
> > > > during a
> > > > > > > > > > controller
> > > > > > > > > > > > > > failover
> > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > > >> > > > interesting
> > > > > > > > > > > > > > > > > > > >> > > > >> one. Technically, we should
> > be
> > > > able
> > > > > > to
> > > > > > > > > > continue
> > > > > > > > > > > > > > sending
> > > > > > > > > > > > > > > > > > > >> incremental
> > > > > > > > > > > > > > > > > > > >> > > > updates
> > > > > > > > > > > > > > > > > > > >> > > > >> to the legacy nodes, for the
> > same
> > > > > > > reason
> > > > > > > > we
> > > > > > > > > > can
> > > > > > > > > > > > in
> > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > mode.
> > > > > > > > > > > > > > > > > > > >> > > However,
> > > > > > > > > > > > > > > > > > > >> > > > >> probably we should just copy
> > > > what ZK
> > > > > > > mode
> > > > > > > > > > does
> > > > > > > > > > > > and
> > > > > > > > > > > > > > send
> > > > > > > > > > > > > > > > > them
> > > > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > >> full
> > > > > > > > > > > > > > > > > > > >> > > > >> metadata update. This will
> > allow
> > > > us
> > > > > > to
> > > > > > > > have
> > > > > > > > > > an
> > > > > > > > > > > > easy
> > > > > > > > > > > > > > way
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > handle
> > > > > > > > > > > > > > > > > > > >> > > > >> divergences caused by lost
> > RPCs
> > > > by
> > > > > > > > changing
> > > > > > > > > > the
> > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > (just
> > > > > > > > > > > > > > > > > > > >> as
> > > > > > > > > > > > > > > > > > > >> > we
> > > > > > > > > > > > > > > > > > > >> > > > do
> > > > > > > > > > > > > > > > > > > >> > > > >> in ZK mode, unfortunately). I
> > > > suppose
> > > > > > > we
> > > > > > > > > > should
> > > > > > > > > > > > > > > document
> > > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > > >> > > > KIP...
> > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > >> > > > >> I agree controlled shutdown
> > is
> > > > tricky
> > > > > > > to
> > > > > > > > > get
> > > > > > > > > > > just
> > > > > > > > > > > > > > > right.
> > > > > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > > > > >> suppose
> > > > > > > > > > > > > > > > > > > >> > > this
> > > > > > > > > > > > > > > > > > > >> > > > is
> > > > > > > > > > > > > > > > > > > >> > > > >> a case where the RPCs we send
> > > > out are
> > > > > > > not
> > > > > > > > > > > purely
> > > > > > > > > > > > > > "fire
> > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > >> forget";
> > > > > > > > > > > > > > > > > > > >> > we
> > > > > > > > > > > > > > > > > > > >> > > > have
> > > > > > > > > > > > > > > > > > > >> > > > >> to listen for the response.
> > But
> > > > that
> > > > > > > can
> > > > > > > > be
> > > > > > > > > > > done
> > > > > > > > > > > > in
> > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > > >> event-based
> > > > > > > > > > > > > > > > > > > >> > > way.
> > > > > > > > > > > > > > > > > > > >> > > > >> Controlled shutdown is also
> > > > probably
> > > > > > > the
> > > > > > > > > last
> > > > > > > > > > > > thing
> > > > > > > > > > > > > > > we'll
> > > > > > > > > > > > > > > > > > > >> implement
> > > > > > > > > > > > > > > > > > > >> > > > once we
> > > > > > > > > > > > > > > > > > > >> > > > >> have the basic lift and
> > shift.
> > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > >> > > > >> Topic deletion is another
> > > > annoying
> > > > > > > > thing. I
> > > > > > > > > > > > wonder
> > > > > > > > > > > > > if
> > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > > >> just
> > > > > > > > > > > > > > > > > > > >> > > > delete
> > > > > > > > > > > > > > > > > > > >> > > > >> immediately here and skip the
> > > > > > > complexity
> > > > > > > > of
> > > > > > > > > > > > > > > implementing
> > > > > > > > > > > > > > > > > > > >> "deleting
> > > > > > > > > > > > > > > > > > > >> > > > state."
> > > > > > > > > > > > > > > > > > > >> > > > >> Topic IDs will exist in IBP
> > 3.4,
> > > > in
> > > > > > > both
> > > > > > > > ZK
> > > > > > > > > > and
> > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > mode,
> > > > > > > > > > > > > > > > > > so
> > > > > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > > >> > > > should be
> > > > > > > > > > > > > > > > > > > >> > > > >> OK, right?
> > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > >> > > > >> best,
> > > > > > > > > > > > > > > > > > > >> > > > >> Colin
> > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> > Thanks,
> > > > > > > > > > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> > Jun
> > > > > > > > > > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> > On Fri, Oct 14, 2022 at
> > 9:29 AM
> > > > > > Colin
> > > > > > > > > > McCabe
> > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > >> cmcc...@apache.org>
> > > > > > > > > > > > > > > > > > > >> > > > wrote:
> > > > > > > > > > > > > > > > > > > >> > > > >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> On Thu, Oct 13, 2022, at
> > > > 11:44,
> > > > > > Jun
> > > > > > > > Rao
> > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > Hi, Colin,
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > Thanks for the reply.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > 10. This is a bit on the
> > > > > > > > > implementation
> > > > > > > > > > > > side.
> > > > > > > > > > > > > If
> > > > > > > > > > > > > > > you
> > > > > > > > > > > > > > > > > > look
> > > > > > > > > > > > > > > > > > > at
> > > > > > > > > > > > > > > > > > > >> > the
> > > > > > > > > > > > > > > > > > > >> > > > >> existing
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > ZK-based controller,
> > most
> > > > of the
> > > > > > > > logic
> > > > > > > > > > is
> > > > > > > > > > > > > around
> > > > > > > > > > > > > > > > > > > >> maintaining an
> > > > > > > > > > > > > > > > > > > >> > > > >> in-memory
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > state of all the
> > resources
> > > > > > > (broker,
> > > > > > > > > > topic,
> > > > > > > > > > > > > > > > partition,
> > > > > > > > > > > > > > > > > > > etc),
> > > > > > > > > > > > > > > > > > > >> > > > >> >> reading/writing
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > to ZK, sending
> > LeaderAndIsr
> > > > and
> > > > > > > > > > > > UpdateMetadata
> > > > > > > > > > > > > > > > > requests
> > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > >> > > > handling
> > > > > > > > > > > > > > > > > > > >> > > > >> the
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > responses to brokers.
> > So we
> > > > need
> > > > > > > all
> > > > > > > > > > that
> > > > > > > > > > > > > logic
> > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > dual
> > > > > > > > > > > > > > > > > > > >> > write
> > > > > > > > > > > > > > > > > > > >> > > > >> mode.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> One
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > option is to duplicate
> > all
> > > > that
> > > > > > > > logic
> > > > > > > > > in
> > > > > > > > > > > > some
> > > > > > > > > > > > > > new
> > > > > > > > > > > > > > > > > code.
> > > > > > > > > > > > > > > > > > > This
> > > > > > > > > > > > > > > > > > > >> > can
> > > > > > > > > > > > > > > > > > > >> > > > be a
> > > > > > > > > > > > > > > > > > > >> > > > >> bit
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > error prone and makes
> > the
> > > > code a
> > > > > > > bit
> > > > > > > > > > > harder
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > maintain
> > > > > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > > > > >> we
> > > > > > > > > > > > > > > > > > > >> > > need
> > > > > > > > > > > > > > > > > > > >> > > > to
> > > > > > > > > > > > > > > > > > > >> > > > >> fix
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > some critical issues in
> > > > ZK-based
> > > > > > > > > > > > controllers.
> > > > > > > > > > > > > > > > Another
> > > > > > > > > > > > > > > > > > > >> option is
> > > > > > > > > > > > > > > > > > > >> > > to
> > > > > > > > > > > > > > > > > > > >> > > > try
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > reusing the existing
> > code
> > > > in the
> > > > > > > > > > ZK-based
> > > > > > > > > > > > > > > > controller.
> > > > > > > > > > > > > > > > > > For
> > > > > > > > > > > > > > > > > > > >> > > example,
> > > > > > > > > > > > > > > > > > > >> > > > we
> > > > > > > > > > > > > > > > > > > >> > > > >> >> could
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > start the EventManager
> > in
> > > > the
> > > > > > > > ZK-based
> > > > > > > > > > > > > > controller,
> > > > > > > > > > > > > > > > but
> > > > > > > > > > > > > > > > > > let
> > > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > > >> > > > KRaft
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > controller ingest new
> > > > events.
> > > > > > This
> > > > > > > > has
> > > > > > > > > > its
> > > > > > > > > > > > own
> > > > > > > > > > > > > > > > > > challenges:
> > > > > > > > > > > > > > > > > > > >> (1)
> > > > > > > > > > > > > > > > > > > >> > > the
> > > > > > > > > > > > > > > > > > > >> > > > >> >> existing
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > logic only logs ZK
> > failures
> > > > and
> > > > > > > > > doesn't
> > > > > > > > > > > > expose
> > > > > > > > > > > > > > > them
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > >> > > caller;
> > > > > > > > > > > > > > > > > > > >> > > > (2)
> > > > > > > > > > > > > > > > > > > >> > > > >> >> the
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > existing logic may add
> > new
> > > > > > events
> > > > > > > to
> > > > > > > > > the
> > > > > > > > > > > > queue
> > > > > > > > > > > > > > > > itself
> > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > >> > > > probably
> > > > > > > > > > > > > > > > > > > >> > > > >> >> need
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > to think through how
> > this is
> > > > > > > > > coordinated
> > > > > > > > > > > > with
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > >> > > controller;
> > > > > > > > > > > > > > > > > > > >> > > > >> (3)
> > > > > > > > > > > > > > > > > > > >> > > > >> >> it
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > registers some ZK
> > listeners
> > > > > > > > > > unnecessarily
> > > > > > > > > > > > (may
> > > > > > > > > > > > > > not
> > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > big
> > > > > > > > > > > > > > > > > > > >> > > > concern).
> > > > > > > > > > > > > > > > > > > >> > > > >> So
> > > > > > > > > > > > > > > > > > > >> > > > >> >> we
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > need to get around those
> > > > issues
> > > > > > > > > > somehow. I
> > > > > > > > > > > > am
> > > > > > > > > > > > > > > > > wondering
> > > > > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > > > > >> we
> > > > > > > > > > > > > > > > > > > >> > > have
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > considered both options
> > and
> > > > > > which
> > > > > > > > > > approach
> > > > > > > > > > > > we
> > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > > leaning
> > > > > > > > > > > > > > > > > > > >> > towards
> > > > > > > > > > > > > > > > > > > >> > > > for
> > > > > > > > > > > > > > > > > > > >> > > > >> the
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > implementation.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> Yes, this is a good
> > question.
> > > > My
> > > > > > > take
> > > > > > > > is
> > > > > > > > > > > that
> > > > > > > > > > > > a
> > > > > > > > > > > > > > big
> > > > > > > > > > > > > > > > part
> > > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > > >> > > > >> complexity
> > > > > > > > > > > > > > > > > > > >> > > > >> >> in the old controller code
> > > > results
> > > > > > > > from
> > > > > > > > > > the
> > > > > > > > > > > > fact
> > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > use
> > > > > > > > > > > > > > > > > > > >> ZK
> > > > > > > > > > > > > > > > > > > >> > as
> > > > > > > > > > > > > > > > > > > >> > > a
> > > > > > > > > > > > > > > > > > > >> > > > >> >> multi-writer database for
> > > > > > > propagating
> > > > > > > > > > > > > information
> > > > > > > > > > > > > > > > > between
> > > > > > > > > > > > > > > > > > > >> > different
> > > > > > > > > > > > > > > > > > > >> > > > >> >> components. So in the old
> > > > > > > controller,
> > > > > > > > > > every
> > > > > > > > > > > > > write
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > needs
> > > > > > > > > > > > > > > > > > > >> to
> > > > > > > > > > > > > > > > > > > >> > be
> > > > > > > > > > > > > > > > > > > >> > > > >> >> structured as a compare
> > and
> > > > swap
> > > > > > to
> > > > > > > be
> > > > > > > > > > fully
> > > > > > > > > > > > > > > correct.
> > > > > > > > > > > > > > > > > > Every
> > > > > > > > > > > > > > > > > > > >> time
> > > > > > > > > > > > > > > > > > > >> > we
> > > > > > > > > > > > > > > > > > > >> > > > get
> > > > > > > > > > > > > > > > > > > >> > > > >> >> notified about something,
> > it's
> > > > > > > usually
> > > > > > > > > in
> > > > > > > > > > > the
> > > > > > > > > > > > > form
> > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > "this
> > > > > > > > > > > > > > > > > > > >> znode
> > > > > > > > > > > > > > > > > > > >> > > > >> changed"
> > > > > > > > > > > > > > > > > > > >> > > > >> >> which prompts a full
> > reload of
> > > > > > part
> > > > > > > of
> > > > > > > > > the
> > > > > > > > > > > > data
> > > > > > > > > > > > > in
> > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > (which
> > > > > > > > > > > > > > > > > > > >> > itself
> > > > > > > > > > > > > > > > > > > >> > > > has
> > > > > > > > > > > > > > > > > > > >> > > > >> >> multiple parts, loading,
> > > > > > > > deserializing,
> > > > > > > > > > > > > > reconciling,
> > > > > > > > > > > > > > > > > etc.)
> > > > > > > > > > > > > > > > > > > >> That
> > > > > > > > > > > > > > > > > > > >> > all
> > > > > > > > > > > > > > > > > > > >> > > > goes
> > > > > > > > > > > > > > > > > > > >> > > > >> >> away in the new mode, and
> > we
> > > > just
> > > > > > > have
> > > > > > > > > > some
> > > > > > > > > > > > code
> > > > > > > > > > > > > > > which
> > > > > > > > > > > > > > > > > > > >> analyzes
> > > > > > > > > > > > > > > > > > > >> > > > >> >> __cluster_metadata and
> > > > reflects it
> > > > > > > in
> > > > > > > > 1)
> > > > > > > > > > > > updates
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > 2)
> > > > > > > > > > > > > > > > > > > >> > > > messages
> > > > > > > > > > > > > > > > > > > >> > > > >> sent
> > > > > > > > > > > > > > > > > > > >> > > > >> >> out to brokers.
> > > > > > > > > > > > > > > > > > > >> > > > >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> This is pretty decoupled
> > from
> > > > the
> > > > > > > > other
> > > > > > > > > > > logic
> > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > >> QuorumController
> > > > > > > > > > > > > > > > > > > >> > > and
> > > > > > > > > > > > > > > > > > > >> > > > >> >> should be easy to unit
> > test,
> > > > since
> > > > > > > the
> > > > > > > > > > same
> > > > > > > > > > > > > inputs
> > > > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > >> log
> > > > > > > > > > > > > > > > > > > >> > > > always
> > > > > > > > > > > > > > > > > > > >> > > > >> >> produce the same output
> > in ZK.
> > > > > > > > > Basically,
> > > > > > > > > > ZK
> > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > write-only
> > > > > > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > >> > us,
> > > > > > > > > > > > > > > > > > > >> > > > we do
> > > > > > > > > > > > > > > > > > > >> > > > >> >> not read it (with the big
> > > > > > exception
> > > > > > > of
> > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > registration
> > > > > > > > > > > > > > > > > > > >> > znodes)
> > > > > > > > > > > > > > > > > > > >> > > > and I
> > > > > > > > > > > > > > > > > > > >> > > > >> >> think that will greatly
> > > > simplify
> > > > > > > > things.
> > > > > > > > > > > > > > > > > > > >> > > > >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> So I think dual-write
> > mode as
> > > > > > > > described
> > > > > > > > > > here
> > > > > > > > > > > > > will
> > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > >> > substantially
> > > > > > > > > > > > > > > > > > > >> > > > >> simpler
> > > > > > > > > > > > > > > > > > > >> > > > >> >> than trying to run part or
> > > > all of
> > > > > > > the
> > > > > > > > > old
> > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > >> > parallel. I
> > > > > > > > > > > > > > > > > > > >> > > > do
> > > > > > > > > > > > > > > > > > > >> > > > >> >> think we will reuse a
> > bunch
> > > > of the
> > > > > > > > > > > > > serialization /
> > > > > > > > > > > > > > > > > > > >> > deserialization
> > > > > > > > > > > > > > > > > > > >> > > > code
> > > > > > > > > > > > > > > > > > > >> > > > >> for
> > > > > > > > > > > > > > > > > > > >> > > > >> >> znodes and possibly the
> > code
> > > > for
> > > > > > > > > > > communicating
> > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > ZK.
> > > > > > > > > > > > > > > > > > > >> > > > >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> best,
> > > > > > > > > > > > > > > > > > > >> > > > >> >> Colin
> > > > > > > > > > > > > > > > > > > >> > > > >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > 14. Good point and make
> > > > sense.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > Thanks,
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > Jun
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> > On Wed, Oct 12, 2022 at
> > > > 3:27 PM
> > > > > > > > Colin
> > > > > > > > > > > > McCabe <
> > > > > > > > > > > > > > > > > > > >> > cmcc...@apache.org
> > > > > > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > > > > > >> > > > >> wrote:
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> Hi Jun,
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> Thanks for taking a
> > look.
> > > > I can
> > > > > > > > > answer
> > > > > > > > > > > some
> > > > > > > > > > > > > > > > questions
> > > > > > > > > > > > > > > > > > > here
> > > > > > > > > > > > > > > > > > > >> > > > because I
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> collaborated on this a
> > > > bit, and
> > > > > > > > David
> > > > > > > > > > is
> > > > > > > > > > > on
> > > > > > > > > > > > > > > > vacation
> > > > > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > >> few
> > > > > > > > > > > > > > > > > > > >> > > > days.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> On Wed, Oct 12, 2022,
> > at
> > > > 14:41,
> > > > > > > Jun
> > > > > > > > > Rao
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > Hi, David,
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > Thanks for the KIP.
> > A few
> > > > > > > > comments
> > > > > > > > > > > below.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > 10. It's still not
> > very
> > > > clear
> > > > > > > to
> > > > > > > > me
> > > > > > > > > > how
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > >> controller
> > > > > > > > > > > > > > > > > > > >> > > > works
> > > > > > > > > > > > > > > > > > > >> > > > >> in
> > > > > > > > > > > > > > > > > > > >> > > > >> >> the
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > dual writes mode to
> > > > KRaft log
> > > > > > > and
> > > > > > > > > ZK
> > > > > > > > > > > when
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > brokers
> > > > > > > > > > > > > > > > > > > >> still
> > > > > > > > > > > > > > > > > > > >> > > run
> > > > > > > > > > > > > > > > > > > >> > > > in
> > > > > > > > > > > > > > > > > > > >> > > > >> ZK
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> mode.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > Does the KRaft
> > controller
> > > > > > run a
> > > > > > > > ZK
> > > > > > > > > > > based
> > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > >> > > parallel
> > > > > > > > > > > > > > > > > > > >> > > > or
> > > > > > > > > > > > > > > > > > > >> > > > >> do
> > > > > > > > > > > > > > > > > > > >> > > > >> >> we
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > derive what needs to
> > be
> > > > > > written
> > > > > > > > to
> > > > > > > > > ZK
> > > > > > > > > > > > based
> > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > >> > > controller
> > > > > > > > > > > > > > > > > > > >> > > > >> logic?
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> We derive what needs
> > to be
> > > > > > > written
> > > > > > > > to
> > > > > > > > > > ZK
> > > > > > > > > > > > > based
> > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > >> > > controller
> > > > > > > > > > > > > > > > > > > >> > > > >> >> logic.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > I am also not sure
> > how
> > > > the
> > > > > > > KRaft
> > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > handles
> > > > > > > > > > > > > > > > > > > >> broker
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > registration/deregistration,
> > > > > > > > since
> > > > > > > > > > > > brokers
> > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > still
> > > > > > > > > > > > > > > > > > > >> running
> > > > > > > > > > > > > > > > > > > >> > > in
> > > > > > > > > > > > > > > > > > > >> > > > ZK
> > > > > > > > > > > > > > > > > > > >> > > > >> >> mode
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> and
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > are not heartbeating
> > to
> > > > the
> > > > > > > KRaft
> > > > > > > > > > > > > controller.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> The new controller will
> > > > listen
> > > > > > > for
> > > > > > > > > > broker
> > > > > > > > > > > > > > > > > registrations
> > > > > > > > > > > > > > > > > > > >> under
> > > > > > > > > > > > > > > > > > > >> > > > >> /brokers.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> This is the only znode
> > > > watch
> > > > > > that
> > > > > > > > the
> > > > > > > > > > new
> > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > > > > >> do.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> We did consider
> > changing
> > > > how
> > > > > > > > ZK-based
> > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > > registration
> > > > > > > > > > > > > > > > > > > >> > > worked,
> > > > > > > > > > > > > > > > > > > >> > > > >> but it
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> just ended up being too
> > > > much
> > > > > > work
> > > > > > > > for
> > > > > > > > > > not
> > > > > > > > > > > > > > enough
> > > > > > > > > > > > > > > > > gain.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > 12. "A new set of
> > nodes
> > > > will
> > > > > > be
> > > > > > > > > > > > provisioned
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > host
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > >> > > > controller
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> quorum."
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > I guess we don't
> > support
> > > > > > > starting
> > > > > > > > > the
> > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > >> > quorum
> > > > > > > > > > > > > > > > > > > >> > > on
> > > > > > > > > > > > > > > > > > > >> > > > >> >> existing
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > brokers. It would be
> > > > useful
> > > > > > to
> > > > > > > > make
> > > > > > > > > > > that
> > > > > > > > > > > > > > clear.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> Agreed
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > 13. "Once the quorum
> > is
> > > > > > > > established
> > > > > > > > > > > and a
> > > > > > > > > > > > > > > leader
> > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > >> elected,
> > > > > > > > > > > > > > > > > > > >> > > the
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> controller
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > will check the state
> > of
> > > > the
> > > > > > > > cluster
> > > > > > > > > > > using
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > >> MigrationCheck
> > > > > > > > > > > > > > > > > > > >> > > > RPC."
> > > > > > > > > > > > > > > > > > > >> > > > >> How
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> does
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > the quorum controller
> > > > detect
> > > > > > > > other
> > > > > > > > > > > > brokers?
> > > > > > > > > > > > > > > Does
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > >> > > controller
> > > > > > > > > > > > > > > > > > > >> > > > >> node
> > > > > > > > > > > > > > > > > > > >> > > > >> >> need
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > to be configured
> > with ZK
> > > > > > > > connection
> > > > > > > > > > > > string?
> > > > > > > > > > > > > > If
> > > > > > > > > > > > > > > > so,
> > > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > > >> would
> > > > > > > > > > > > > > > > > > > >> > be
> > > > > > > > > > > > > > > > > > > >> > > > >> useful
> > > > > > > > > > > > > > > > > > > >> > > > >> >> to
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > document the
> > additional
> > > > > > configs
> > > > > > > > > that
> > > > > > > > > > > the
> > > > > > > > > > > > > > quorum
> > > > > > > > > > > > > > > > > > > >> controller
> > > > > > > > > > > > > > > > > > > >> > > > needs to
> > > > > > > > > > > > > > > > > > > >> > > > >> >> set.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> Yes, the controllers
> > > > monitor ZK
> > > > > > > for
> > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > > registrations,
> > > > > > > > > > > > > > > > > > > >> as I
> > > > > > > > > > > > > > > > > > > >> > > > >> mentioned
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> above. So they need
> > > > zk.connect
> > > > > > > and
> > > > > > > > > the
> > > > > > > > > > > > other
> > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > connection
> > > > > > > > > > > > > > > > > > > >> > > > >> >> configurations.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > 14. "In order to
> > prevent
> > > > > > > further
> > > > > > > > > > writes
> > > > > > > > > > > > to
> > > > > > > > > > > > > > ZK,
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > first
> > > > > > > > > > > > > > > > > > > >> > thing
> > > > > > > > > > > > > > > > > > > >> > > > the
> > > > > > > > > > > > > > > > > > > >> > > > >> new
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > KRaft quorum must do
> > is
> > > > take
> > > > > > > over
> > > > > > > > > > > > > leadership
> > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > > >> > > > controller.
> > > > > > > > > > > > > > > > > > > >> > > > >> "
> > > > > > > > > > > > > > > > > > > >> > > > >> >> The
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> ZK
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > controller processing
> > > > changes
> > > > > > > to
> > > > > > > > > > > > > /controller
> > > > > > > > > > > > > > > > update
> > > > > > > > > > > > > > > > > > > >> > > > asynchronously.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> How
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > does the KRaft
> > controller
> > > > > > know
> > > > > > > > when
> > > > > > > > > > the
> > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > has
> > > > > > > > > > > > > > > > > > > >> > > > resigned
> > > > > > > > > > > > > > > > > > > >> > > > >> >> before
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > it can safely copy
> > the ZK
> > > > > > data?
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> This should be done
> > through
> > > > > > > > > > > > > > > > > > > >> expectedControllerEpochZkVersion,
> > > > > > > > > > > > > > > > > > > >> > > just
> > > > > > > > > > > > > > > > > > > >> > > > >> like
> > > > > > > > > > > > > > > > > > > >> > > > >> >> in
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> ZK mode, right? We
> > should
> > > > bump
> > > > > > > this
> > > > > > > > > > epoch
> > > > > > > > > > > > > value
> > > > > > > > > > > > > > > so
> > > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > any
> > > > > > > > > > > > > > > > > > > >> > > writes
> > > > > > > > > > > > > > > > > > > >> > > > >> from
> > > > > > > > > > > > > > > > > > > >> > > > >> >> the
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> old controller will
> > not go
> > > > > > > > through. I
> > > > > > > > > > > agree
> > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > should
> > > > > > > > > > > > > > > > > > > spell
> > > > > > > > > > > > > > > > > > > >> > this
> > > > > > > > > > > > > > > > > > > >> > > > out
> > > > > > > > > > > > > > > > > > > >> > > > >> in
> > > > > > > > > > > > > > > > > > > >> > > > >> >> the
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> KIP.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > 15. We have the
> > following
> > > > > > > > > sentences.
> > > > > > > > > > > One
> > > > > > > > > > > > > says
> > > > > > > > > > > > > > > > > > > >> ControllerId
> > > > > > > > > > > > > > > > > > > >> > is
> > > > > > > > > > > > > > > > > > > >> > > a
> > > > > > > > > > > > > > > > > > > >> > > > >> random
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > KRaft broker and the
> > > > other
> > > > > > says
> > > > > > > > > it's
> > > > > > > > > > > the
> > > > > > > > > > > > > > active
> > > > > > > > > > > > > > > > > > > >> controller.
> > > > > > > > > > > > > > > > > > > >> > > > Which
> > > > > > > > > > > > > > > > > > > >> > > > >> one
> > > > > > > > > > > > > > > > > > > >> > > > >> >> is
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > correct?
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > "UpdateMetadata: for
> > > > certain
> > > > > > > > > metadata
> > > > > > > > > > > > > > changes,
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > >> > > > controller
> > > > > > > > > > > > > > > > > > > >> > > > >> >> will
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > need to send
> > > > > > > > UpdateMetadataRequests
> > > > > > > > > > to
> > > > > > > > > > > > the
> > > > > > > > > > > > > ZK
> > > > > > > > > > > > > > > > > > brokers.
> > > > > > > > > > > > > > > > > > > >> For
> > > > > > > > > > > > > > > > > > > >> > the
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > “ControllerId” field
> > in
> > > > this
> > > > > > > > > request,
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > > controller
> > > > > > > > > > > > > > > > > > > >> should
> > > > > > > > > > > > > > > > > > > >> > > > >> specify a
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > random KRaft broker."
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > "In the
> > > > UpdateMetadataRequest
> > > > > > > > sent
> > > > > > > > > by
> > > > > > > > > > > the
> > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > > > > >> controller
> > > > > > > > > > > > > > > > > > > >> > to
> > > > > > > > > > > > > > > > > > > >> > > > the
> > > > > > > > > > > > > > > > > > > >> > > > >> ZK
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > brokers, the
> > ControllerId
> > > > > > will
> > > > > > > > > point
> > > > > > > > > > to
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > active
> > > > > > > > > > > > > > > > > > > >> > controller
> > > > > > > > > > > > > > > > > > > >> > > > which
> > > > > > > > > > > > > > > > > > > >> > > > >> >> will
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> be
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > used for the
> > inter-broker
> > > > > > > > > requests."
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> Yeah, this seems like
> > an
> > > > error
> > > > > > to
> > > > > > > > me
> > > > > > > > > as
> > > > > > > > > > > > > well. A
> > > > > > > > > > > > > > > > > random
> > > > > > > > > > > > > > > > > > > >> value
> > > > > > > > > > > > > > > > > > > >> > is
> > > > > > > > > > > > > > > > > > > >> > > > not
> > > > > > > > > > > > > > > > > > > >> > > > >> >> really
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> useful. Plus the text
> > here
> > > > is
> > > > > > > > > > > > > > self-contradictory,
> > > > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > you
> > > > > > > > > > > > > > > > > > > >> > pointed
> > > > > > > > > > > > > > > > > > > >> > > > out.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> I suspect what we
> > should do
> > > > > > here
> > > > > > > is
> > > > > > > > > > add a
> > > > > > > > > > > > new
> > > > > > > > > > > > > > > > field,
> > > > > > > > > > > > > > > > > > > >> > > > >> KRaftControllerId,
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> and populate it with
> > the
> > > > real
> > > > > > > > > > controller
> > > > > > > > > > > > ID,
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > leave
> > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > >> old
> > > > > > > > > > > > > > > > > > > >> > > > >> >> controllerId
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> field as -1. A ZK-based
> > > > broker
> > > > > > > that
> > > > > > > > > > sees
> > > > > > > > > > > > this
> > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > then
> > > > > > > > > > > > > > > > > > > >> consult
> > > > > > > > > > > > > > > > > > > >> > > its
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > controller.quorum.voters
> > > > > > > > > configuration
> > > > > > > > > > to
> > > > > > > > > > > > see
> > > > > > > > > > > > > > > where
> > > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > > >> should
> > > > > > > > > > > > > > > > > > > >> > > send
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> controller-bound RPCs.
> > That
> > > > > > > > (static)
> > > > > > > > > > > > > > > configuration
> > > > > > > > > > > > > > > > > lets
> > > > > > > > > > > > > > > > > > > us
> > > > > > > > > > > > > > > > > > > >> map
> > > > > > > > > > > > > > > > > > > >> > > > >> between
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> controller ID and
> > > > host:port.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> We should still keep
> > our
> > > > > > existing
> > > > > > > > > epoch
> > > > > > > > > > > > logic
> > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > deciding
> > > > > > > > > > > > > > > > > > > >> > when
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> UpdateMetadataRequest /
> > > > > > > > > > > > LeaderAndIsrRequests
> > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > stale,
> > > > > > > > > > > > > > > > > > > >> with
> > > > > > > > > > > > > > > > > > > >> > the
> > > > > > > > > > > > > > > > > > > >> > > > >> caveat
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> that any kraft-based
> > epoch
> > > > > > should
> > > > > > > > be
> > > > > > > > > > > > treated
> > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > greater
> > > > > > > > > > > > > > > > > > > >> than
> > > > > > > > > > > > > > > > > > > >> > any
> > > > > > > > > > > > > > > > > > > >> > > > >> >> ZK-based
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> epoch. After all, the
> > kraft
> > > > > > epoch
> > > > > > > > is
> > > > > > > > > > > coming
> > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > epoch
> > > > > > > > > > > > > > > > > > > >> of
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> __cluster_metadata,
> > > > whereas the
> > > > > > > ZK
> > > > > > > > > > epoch
> > > > > > > > > > > > > comes
> > > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > > ZK.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > 16. "Additionally,
> > the
> > > > > > > controller
> > > > > > > > > > must
> > > > > > > > > > > > > > specify
> > > > > > > > > > > > > > > > if a
> > > > > > > > > > > > > > > > > > > >> broker
> > > > > > > > > > > > > > > > > > > >> > in
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> “LiveBrokers”
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > is KRaft or ZK." Does
> > > > that
> > > > > > > > require
> > > > > > > > > > any
> > > > > > > > > > > > > > protocol
> > > > > > > > > > > > > > > > > > changes
> > > > > > > > > > > > > > > > > > > >> to
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> UpdateMetadata?
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> Yeah, I am also curious
> > > > why the
> > > > > > > we
> > > > > > > > > need
> > > > > > > > > > > to
> > > > > > > > > > > > > care
> > > > > > > > > > > > > > > > > whether
> > > > > > > > > > > > > > > > > > > >> > brokers
> > > > > > > > > > > > > > > > > > > >> > > > are
> > > > > > > > > > > > > > > > > > > >> > > > >> ZK
> > > > > > > > > > > > > > > > > > > >> > > > >> >> or
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> KRaft in
> > > > UpdateMetadataRequest.
> > > > > > > We
> > > > > > > > > > don't
> > > > > > > > > > > > > reveal
> > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > >> > clients,
> > > > > > > > > > > > > > > > > > > >> > > > so
> > > > > > > > > > > > > > > > > > > >> > > > >> can
> > > > > > > > > > > > > > > > > > > >> > > > >> >> we
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> just leave this out?
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> best,
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> Colin
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > Thanks,
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > Jun
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > On Wed, Oct 5, 2022
> > at
> > > > 10:07
> > > > > > AM
> > > > > > > > > > Mickael
> > > > > > > > > > > > > > Maison
> > > > > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > >> > > > >> >> mickael.mai...@gmail.com
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> > wrote:
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> Hi David,
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> Thanks for starting
> > this
> > > > > > > > important
> > > > > > > > > > > KIP.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> I've just taken a
> > quick
> > > > look
> > > > > > > so
> > > > > > > > > far
> > > > > > > > > > > but
> > > > > > > > > > > > > I've
> > > > > > > > > > > > > > > > got a
> > > > > > > > > > > > > > > > > > > >> couple
> > > > > > > > > > > > > > > > > > > >> > of
> > > > > > > > > > > > > > > > > > > >> > > > >> initial
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> questions:
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> 1) What happens if
> > a non
> > > > > > KRaft
> > > > > > > > > > > > compatible
> > > > > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > > (or
> > > > > > > > > > > > > > > > > > > >> with
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >>
> > > > > > > kafka.metadata.migration.enable
> > > > > > > > > set
> > > > > > > > > > to
> > > > > > > > > > > > > > false)
> > > > > > > > > > > > > > > > > joins
> > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > >> > > cluster
> > > > > > > > > > > > > > > > > > > >> > > > >> after
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> the migration is
> > > > triggered?
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> 2) In the Failure
> > Modes
> > > > > > > section
> > > > > > > > > you
> > > > > > > > > > > > > mention
> > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > scenario
> > > > > > > > > > > > > > > > > > > >> > where
> > > > > > > > > > > > > > > > > > > >> > > a
> > > > > > > > > > > > > > > > > > > >> > > > >> write
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> to ZK fails. What
> > > > happens
> > > > > > when
> > > > > > > > the
> > > > > > > > > > > > > > divergence
> > > > > > > > > > > > > > > > > limit
> > > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > >> > > > reached? Is
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> this a fatal
> > condition?
> > > > How
> > > > > > > much
> > > > > > > > > > > > > divergence
> > > > > > > > > > > > > > > > should
> > > > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > >> > allow?
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> Thanks,
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> Mickael
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> On Wed, Oct 5, 2022
> > at
> > > > 12:20
> > > > > > > AM
> > > > > > > > > > David
> > > > > > > > > > > > > > Arthur <
> > > > > > > > > > > > > > > > > > > >> > > mum...@gmail.com
> > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> wrote:
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > Hey folks, I
> > wanted
> > > > to get
> > > > > > > the
> > > > > > > > > > ball
> > > > > > > > > > > > > > rolling
> > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > >> > > discussion
> > > > > > > > > > > > > > > > > > > >> > > > >> for
> > > > > > > > > > > > > > > > > > > >> > > > >> >> the
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > ZooKeeper
> > migration
> > > > KIP.
> > > > > > > This
> > > > > > > > > KIP
> > > > > > > > > > > > > details
> > > > > > > > > > > > > > > how
> > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > plan
> > > > > > > > > > > > > > > > > > > >> to
> > > > > > > > > > > > > > > > > > > >> > do
> > > > > > > > > > > > > > > > > > > >> > > > an
> > > > > > > > > > > > > > > > > > > >> > > > >> >> online
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > migration of
> > metadata
> > > > from
> > > > > > > > > > ZooKeeper
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > KRaft
> > > > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > > well
> > > > > > > > > > > > > > > > > > > >> as
> > > > > > > > > > > > > > > > > > > >> > a
> > > > > > > > > > > > > > > > > > > >> > > > >> rolling
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > upgrade of
> > brokers to
> > > > > > KRaft
> > > > > > > > > mode.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > The general idea
> > is to
> > > > > > keep
> > > > > > > > > KRaft
> > > > > > > > > > > and
> > > > > > > > > > > > > > > > ZooKeeper
> > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > >> sync
> > > > > > > > > > > > > > > > > > > >> > > > during
> > > > > > > > > > > > > > > > > > > >> > > > >> the
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > migration, so both
> > > > types
> > > > > > of
> > > > > > > > > > brokers
> > > > > > > > > > > > can
> > > > > > > > > > > > > > > exist
> > > > > > > > > > > > > > > > > > > >> > > simultaneously.
> > > > > > > > > > > > > > > > > > > >> > > > >> Then,
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > once everything is
> > > > > > migrated
> > > > > > > > and
> > > > > > > > > > > > updated,
> > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > > turn
> > > > > > > > > > > > > > > > > > > >> off
> > > > > > > > > > > > > > > > > > > >> > > > >> ZooKeeper
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > writes.
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > This is a pretty
> > > > complex
> > > > > > > KIP,
> > > > > > > > so
> > > > > > > > > > > > please
> > > > > > > > > > > > > > > take a
> > > > > > > > > > > > > > > > > > look
> > > > > > > > > > > > > > > > > > > :)
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-866+ZooKeeper+to+KRaft+Migration
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> >
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > Thanks!
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >> > David
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >> >>
> > > > > > > > > > > > > > > > > > > >> > > > >>
> > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > > > > > > > >> > > > > --
> > > > > > > > > > > > > > > > > > > >> > > > > -David
> > > > > > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > >> > > --
> > > > > > > > > > > > > > > > > > > >> > > -David
> > > > > > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > > > > > -David
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > > > > -David
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > > -David
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > -David
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > -David
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > -David
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > -David
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > -David
> > > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > David Arthur
> > > >
> >
> >
> >
> > --
> > David Arthur
> >



-- 
David Arthur

Reply via email to