Hi, Kowshik,

Thanks for the update. Those changes look good to me.

Jun

On Tue, Oct 13, 2020 at 4:50 PM Kowshik Prakasam <kpraka...@confluent.io>
wrote:

> Hi all,
>
> I wanted to let you know that I have made the following minor changes to
> the `kafka-features` CLI tool description in the KIP-584 write up. The
> purpose is to ensure the design is correct for a few things which came up
> during implementation:
>
> 1. The CLI tool now produces a tab-formatted output instead of JSON. This
> aligns with the type of format produced by other admin CLI tools of Kafka,
> ex: `kafka-topics`.
> 2. Whenever feature updates are performed, the output of the CLI tool shows
> the result of each feature update that was applied.
> 3. The CLI tool accepts an optional argument `--dry-run` which lets the
> user preview the feature updates before applying them.
>
> The following section of the KIP has been updated with the above changes:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP584:Versioningschemeforfeatures-Toolingsupport
>
> Please let me know if you have any questions.
>
>
> Cheers,
> Kowshik
>
>
> On Thu, Oct 8, 2020 at 1:12 AM Kowshik Prakasam <kpraka...@confluent.io>
> wrote:
>
> > Hi Jun,
> >
> > This is a very good point. I have updated the feature version deprecation
> > section mentioning the same:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP584:Versioningschemeforfeatures-Featureversiondeprecation
> > .
> >
> > Thank you for the suggestion.
> >
> >
> > Cheers,
> > Kowshik
> >
> >
> > On Tue, Oct 6, 2020 at 5:30 PM Jun Rao <j...@confluent.io> wrote:
> >
> >> Hi, Kowshik,
> >>
> >> Thanks for the follow up. Both look good to me.
> >>
> >> For 2, it would be useful to also add that an admin should make sure
> that
> >> no clients are using a deprecated feature version (e.g. using the client
> >> version metric) before deploying a release that deprecates it.
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >> On Tue, Oct 6, 2020 at 3:46 PM Kowshik Prakasam <kpraka...@confluent.io
> >
> >> wrote:
> >>
> >> > Hi Jun,
> >> >
> >> > I have added the following details in the KIP-584 write up:
> >> >
> >> > 1. Deployment, IBP deprecation and avoidance of double rolls. This
> >> section
> >> > talks about the various phases of work that would be required to use
> >> this
> >> > KIP to eventually avoid Broker double rolls in the cluster (whenever
> IBP
> >> > values are advanced). Link to section:
> >> >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP584:Versioningschemeforfeatures-Deployment,IBPdeprecationandavoidanceofdoublerolls
> >> > .
> >> >
> >> > 2. Feature version deprecation. This section explains the idea for
> >> feature
> >> > version deprecation (using highest supported feature min version)
> which
> >> you
> >> > had proposed during code review:
> >> >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP584:Versioningschemeforfeatures-Featureversiondeprecation
> >> > .
> >> >
> >> > Please let me know if you have any questions.
> >> >
> >> >
> >> > Cheers,
> >> > Kowshik
> >> >
> >> >
> >> > On Tue, Sep 29, 2020 at 11:07 AM Jun Rao <j...@confluent.io> wrote:
> >> >
> >> > > Hi, Kowshik,
> >> > >
> >> > > Thanks for the update. Regarding enabling a single rolling restart
> in
> >> the
> >> > > future, could we sketch out a bit how this will work by treating IBP
> >> as a
> >> > > feature? For example, IBP currently uses the release version and
> this
> >> KIP
> >> > > uses an integer for versions. How do we bridge the gap between the
> >> two?
> >> > > Does min.version still make sense for IBP as a feature?
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Jun
> >> > >
> >> > > On Fri, Sep 25, 2020 at 5:57 PM Kowshik Prakasam <
> >> kpraka...@confluent.io
> >> > >
> >> > > wrote:
> >> > >
> >> > > > Hi Colin,
> >> > > >
> >> > > > Thanks for the feedback. Those are very good points. I have made
> the
> >> > > > following changes to the KIP as you had suggested:
> >> > > > 1. Included the `timeoutMs` field in the `UpdateFeaturesRequest`
> >> > schema.
> >> > > > The initial implementation won't be making use of the field, but
> we
> >> can
> >> > > > always use it in the future as the need arises.
> >> > > > 2. Modified the `FinalizedFeaturesEpoch` field in
> >> `ApiVersionsResponse`
> >> > > to
> >> > > > use int64. This is to avoid overflow problems in the future once
> ZK
> >> is
> >> > > > gone.
> >> > > >
> >> > > > I have also incorporated these changes into the versioning write
> >> path
> >> > PR
> >> > > > that is currently under review:
> >> > > https://github.com/apache/kafka/pull/9001.
> >> > > >
> >> > > >
> >> > > > Cheers,
> >> > > > Kowshik
> >> > > >
> >> > > >
> >> > > >
> >> > > > On Fri, Sep 25, 2020 at 4:57 PM Kowshik Prakasam <
> >> > kpraka...@confluent.io
> >> > > >
> >> > > > wrote:
> >> > > >
> >> > > > > Hi Jun,
> >> > > > >
> >> > > > > Thanks for the feedback. It's a very good point. I have now
> >> modified
> >> > > the
> >> > > > > KIP-584 write-up "goals" section a bit. It now mentions one of
> the
> >> > > goals
> >> > > > as
> >> > > > > enabling rolling upgrades using a single restart (instead of 2).
> >> > Also I
> >> > > > > have removed the text explicitly aiming for deprecation of IBP.
> >> Note
> >> > > that
> >> > > > > previously under "Potential features in Kafka" the IBP was
> >> mentioned
> >> > > > under
> >> > > > > point (4) as a possible coarse-grained feature. Hopefully, now
> >> the 2
> >> > > > > sections of the KIP align with each other well.
> >> > > > >
> >> > > > >
> >> > > > > Cheers,
> >> > > > > Kowshik
> >> > > > >
> >> > > > >
> >> > > > > On Fri, Sep 25, 2020 at 2:03 PM Colin McCabe <
> cmcc...@apache.org>
> >> > > wrote:
> >> > > > >
> >> > > > >> On Tue, Sep 22, 2020, at 00:43, Kowshik Prakasam wrote:
> >> > > > >> > Hi all,
> >> > > > >> >
> >> > > > >> > I wanted to let you know that I have made the following
> >> changes to
> >> > > the
> >> > > > >> > KIP-584 write up. The purpose is to ensure the design is
> >> correct
> >> > > for a
> >> > > > >> few
> >> > > > >> > things which came up during implementation:
> >> > > > >> >
> >> > > > >>
> >> > > > >> Hi Kowshik,
> >> > > > >>
> >> > > > >> Thanks for the updates.
> >> > > > >>
> >> > > > >> >
> >> > > > >> > 1. Per FeatureUpdate error code: The UPDATE_FEATURES
> controller
> >> > API
> >> > > is
> >> > > > >> no
> >> > > > >> > longer transactional. Going forward, we allow for individual
> >> > > > >> FeatureUpdate
> >> > > > >> > to succeed/fail in the request. As a result, the response
> >> schema
> >> > now
> >> > > > >> > contains an error code per FeatureUpdate as well as a
> top-level
> >> > > error
> >> > > > >> code.
> >> > > > >> > Overall this is a better design because it better represents
> >> the
> >> > > > nature
> >> > > > >> of
> >> > > > >> > the API: each FeatureUpdate in the request is independent of
> >> the
> >> > > other
> >> > > > >> > updates, and the controller can process/apply these
> >> independently
> >> > to
> >> > > > ZK.
> >> > > > >> > When an UPDATE_FEATURES request fails, this new design
> provides
> >> > > better
> >> > > > >> > clarity to the caller on which FeatureUpdate could not be
> >> applied
> >> > > (via
> >> > > > >> the
> >> > > > >> > individual error codes). In the previous design, we were
> >> unable to
> >> > > > >> achieve
> >> > > > >> > such an increased level of clarity in communicating the error
> >> > codes.
> >> > > > >> >
> >> > > > >>
> >> > > > >> OK
> >> > > > >>
> >> > > > >> >
> >> > > > >> > 2. Due to #1, there were some minor changes required to the
> >> > proposed
> >> > > > >> Admin
> >> > > > >> > APIs (describeFeatures and updateFeatures). A few unnecessary
> >> > public
> >> > > > >> APIs
> >> > > > >> > have been removed, and couple essential ones have been added.
> >> The
> >> > > > latest
> >> > > > >> > changes now represent the latest design.
> >> > > > >> >
> >> > > > >> > 3. The timeoutMs field has been removed from the the
> >> > UPDATE_FEATURES
> >> > > > API
> >> > > > >> > request, since it was not found to be required during
> >> > > implementation.
> >> > > > >> >
> >> > > > >>
> >> > > > >> Please don't get rid of timeoutMs.  timeoutMs is required if
> you
> >> > want
> >> > > to
> >> > > > >> implement the ability to timeout the call if the controller
> can't
> >> > get
> >> > > > to it
> >> > > > >> in time.  This is important for avoiding congestion collapse
> >> where
> >> > the
> >> > > > >> controller collapses under the weight of lots of retries of the
> >> same
> >> > > > set of
> >> > > > >> calls.
> >> > > > >>
> >> > > > >> We may not be able to do it in the initial implementation, but
> we
> >> > will
> >> > > > >> eventually implement this for all the controller-bound RPCs.
> >> > > > >>
> >> > > > >> > >
> >> > > > >> > > 2. Finalized feature version epoch data type has been made
> >> to be
> >> > > > int32
> >> > > > >> > > (instead of int64). The reason is that the epoch value is
> the
> >> > > value
> >> > > > >> of ZK
> >> > > > >> > > node version, whose data type is int32.
> >> > > > >> > >
> >> > > > >>
> >> > > > >> Sorry, I missed this earlier.  Using 16 bit feature levels
> seems
> >> > fine.
> >> > > > >> However, please don't use a 32-bit epoch here.  We deliberately
> >> made
> >> > > the
> >> > > > >> epoch 64 bits to avoid overflow problems in the future once ZK
> is
> >> > > gone.
> >> > > > >>
> >> > > > >> best,
> >> > > > >> Colin
> >> > > > >>
> >> > > > >> > > 3. Introduced a new 'status' field in the '/features' ZK
> node
> >> > > > schema.
> >> > > > >> The
> >> > > > >> > > purpose is to implement Colin's earlier point for the
> >> strategy
> >> > for
> >> > > > >> > > transitioning from not having a /features znode to having
> >> one.
> >> > An
> >> > > > >> > > explanation has been provided in the following section of
> the
> >> > KIP
> >> > > > >> detailing
> >> > > > >> > > the different cases:
> >> > > > >> > >
> >> > > > >>
> >> > > >
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP584:Versioningschemeforfeatures-FeatureZKnodestatus
> >> > > > >> > > .
> >> > > > >> > >
> >> > > > >> > > Please let me know if you have any questions or concerns.
> >> > > > >> > >
> >> > > > >> > >
> >> > > > >> > > Cheers,
> >> > > > >> > > Kowshik
> >> > > > >> > >
> >> > > > >> > >
> >> > > > >> > >
> >> > > > >> > > Cheers,
> >> > > > >> > > Kowshik
> >> > > > >> > >
> >> > > > >> > > On Tue, Apr 28, 2020 at 11:24 PM Kowshik Prakasam <
> >> > > > >> kpraka...@confluent.io>
> >> > > > >> > > wrote:
> >> > > > >> > >
> >> > > > >> > >> Hi all,
> >> > > > >> > >>
> >> > > > >> > >> This KIP vote has been open for ~12 days. The summary of
> the
> >> > > votes
> >> > > > is
> >> > > > >> > >> that we have 3 binding votes (Colin, Guozhang, Jun), and 3
> >> > > > >> non-binding
> >> > > > >> > >> votes (David, Dhruvil, Boyang). Therefore, the KIP vote
> >> passes.
> >> > > > I'll
> >> > > > >> mark
> >> > > > >> > >> KIP as accepted and start working on the implementation.
> >> > > > >> > >>
> >> > > > >> > >> Thanks a lot!
> >> > > > >> > >>
> >> > > > >> > >>
> >> > > > >> > >> Cheers,
> >> > > > >> > >> Kowshik
> >> > > > >> > >>
> >> > > > >> > >> On Mon, Apr 27, 2020 at 12:15 PM Colin McCabe <
> >> > > cmcc...@apache.org>
> >> > > > >> wrote:
> >> > > > >> > >>
> >> > > > >> > >>> Thanks, Kowshik.  +1 (binding)
> >> > > > >> > >>>
> >> > > > >> > >>> best,
> >> > > > >> > >>> Colin
> >> > > > >> > >>>
> >> > > > >> > >>> On Sat, Apr 25, 2020, at 13:20, Kowshik Prakasam wrote:
> >> > > > >> > >>> > Hi Colin,
> >> > > > >> > >>> >
> >> > > > >> > >>> > Thanks for the explanation! I agree with you, and I
> have
> >> > > updated
> >> > > > >> the
> >> > > > >> > >>> > KIP.
> >> > > > >> > >>> > Here is a link to relevant section:
> >> > > > >> > >>> >
> >> > > > >> > >>>
> >> > > > >>
> >> > > >
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Controller:ZKnodebootstrapwithdefaultvalues
> >> > > > >> > >>> >
> >> > > > >> > >>> >
> >> > > > >> > >>> > Cheers,
> >> > > > >> > >>> > Kowshik
> >> > > > >> > >>> >
> >> > > > >> > >>> > On Fri, Apr 24, 2020 at 8:50 PM Colin McCabe <
> >> > > > cmcc...@apache.org>
> >> > > > >> > >>> wrote:
> >> > > > >> > >>> >
> >> > > > >> > >>> > > On Fri, Apr 24, 2020, at 00:01, Kowshik Prakasam
> wrote:
> >> > > > >> > >>> > > > (Kowshik): Great point! However for case #1, I'm
> not
> >> > sure
> >> > > > why
> >> > > > >> we
> >> > > > >> > >>> need to
> >> > > > >> > >>> > > > create a '/features' ZK node with disabled
> features.
> >> > > > Instead,
> >> > > > >> do
> >> > > > >> > >>> you see
> >> > > > >> > >>> > > > any drawback if we just do not create it? i.e. if
> >> IBP is
> >> > > > less
> >> > > > >> than
> >> > > > >> > >>> 2.6,
> >> > > > >> > >>> > > the
> >> > > > >> > >>> > > > controller treats the case as though the versioning
> >> > system
> >> > > > is
> >> > > > >> > >>> completely
> >> > > > >> > >>> > > > disabled, and would not create a non-existing
> >> > '/features'
> >> > > > >> node.
> >> > > > >> > >>> > >
> >> > > > >> > >>> > > Hi Kowshik,
> >> > > > >> > >>> > >
> >> > > > >> > >>> > > When the IBP is less than 2.6, but the software has
> >> been
> >> > > > >> upgraded to
> >> > > > >> > >>> a
> >> > > > >> > >>> > > state where it supports this KIP, that
> >> > > > >> > >>> > >  means the user is upgrading from an earlier version
> of
> >> > the
> >> > > > >> > >>> software.  In
> >> > > > >> > >>> > > this case, we want to start with all the features
> >> disabled
> >> > > and
> >> > > > >> allow
> >> > > > >> > >>> the
> >> > > > >> > >>> > > user to enable them when they are ready.
> >> > > > >> > >>> > >
> >> > > > >> > >>> > > Enabling all the possible features immediately after
> an
> >> > > > upgrade
> >> > > > >> > >>> could be
> >> > > > >> > >>> > > harmful to the cluster.  On the other hand, for a new
> >> > > cluster,
> >> > > > >> we do
> >> > > > >> > >>> want
> >> > > > >> > >>> > > to enable all the possible features immediately . I
> was
> >> > > > >> proposing
> >> > > > >> > >>> this as a
> >> > > > >> > >>> > > way to distinguish the two cases (since the new
> cluster
> >> > will
> >> > > > >> never be
> >> > > > >> > >>> > > started with an old IBP).
> >> > > > >> > >>> > >
> >> > > > >> > >>> > > > Colin MccCabe wrote:
> >> > > > >> > >>> > > > > And now, something a little bit bigger (sorry).
> >> For
> >> > > > >> finalized
> >> > > > >> > >>> > > features,
> >> > > > >> > >>> > > > > why do we need both min_version_level and
> >> > > > max_version_level?
> >> > > > >> > >>> Assuming
> >> > > > >> > >>> > > that
> >> > > > >> > >>> > > > > we want all the brokers to be on the same feature
> >> > > version
> >> > > > >> level,
> >> > > > >> > >>> we
> >> > > > >> > >>> > > really only care
> >> > > > >> > >>> > > > > about three numbers for each feature, right?  The
> >> > > minimum
> >> > > > >> > >>> supported
> >> > > > >> > >>> > > version
> >> > > > >> > >>> > > > > level, the maximum supported version level, and
> the
> >> > > > current
> >> > > > >> > >>> active
> >> > > > >> > >>> > > version level.
> >> > > > >> > >>> > > >
> >> > > > >> > >>> > > > > We don't actually want different brokers to be on
> >> > > > different
> >> > > > >> > >>> versions of
> >> > > > >> > >>> > > > > the same feature, right?  So we can just have one
> >> > number
> >> > > > for
> >> > > > >> > >>> current
> >> > > > >> > >>> > > > > version level, rather than two.  At least that's
> >> what
> >> > I
> >> > > > was
> >> > > > >> > >>> thinking
> >> > > > >> > >>> > > -- let
> >> > > > >> > >>> > > > > me know if I missed something.
> >> > > > >> > >>> > > >
> >> > > > >> > >>> > > > (Kowshik): It is my understanding that the "current
> >> > active
> >> > > > >> version
> >> > > > >> > >>> level"
> >> > > > >> > >>> > > > that you have mentioned, is the
> "max_version_level".
> >> But
> >> > > we
> >> > > > >> still
> >> > > > >> > >>> > > > maintain/publish both min and max version levels,
> >> > because,
> >> > > > the
> >> > > > >> > >>> detail
> >> > > > >> > >>> > > about
> >> > > > >> > >>> > > > min level is useful to external clients. This is
> >> > described
> >> > > > >> below.
> >> > > > >> > >>> > > >
> >> > > > >> > >>> > > > For any feature F, think of the closed range:
> >> > > > >> [min_version_level,
> >> > > > >> > >>> > > > max_version_level] as the range of finalized
> >> versions,
> >> > > > that's
> >> > > > >> > >>> guaranteed
> >> > > > >> > >>> > > to
> >> > > > >> > >>> > > > be supported by all brokers in the cluster.
> >> > > > >> > >>> > > >  - "max_version_level" is the finalized highest
> >> common
> >> > > > version
> >> > > > >> > >>> among all
> >> > > > >> > >>> > > > brokers,
> >> > > > >> > >>> > > >  - "min_version_level" is the finalized lowest
> common
> >> > > > version
> >> > > > >> > >>> among all
> >> > > > >> > >>> > > > brokers.
> >> > > > >> > >>> > > >
> >> > > > >> > >>> > > > Next, think of "client" here as the "user of the
> new
> >> > > feature
> >> > > > >> > >>> versions
> >> > > > >> > >>> > > > system". Imagine that such a client learns about
> >> > finalized
> >> > > > >> feature
> >> > > > >> > >>> > > > versions, and exercises some logic based on the
> >> version.
> >> > > > These
> >> > > > >> > >>> clients
> >> > > > >> > >>> > > can
> >> > > > >> > >>> > > > be of 2 types:
> >> > > > >> > >>> > > > 1. Some part of the broker code itself could behave
> >> > like a
> >> > > > >> client
> >> > > > >> > >>> trying
> >> > > > >> > >>> > > to
> >> > > > >> > >>> > > > use some feature that's "internal" to the broker
> >> > cluster.
> >> > > > >> Such a
> >> > > > >> > >>> client
> >> > > > >> > >>> > > > would learn the latest finalized features via ZK.
> >> > > > >> > >>> > > > 2. An external system (ex: Streams) could behave
> >> like a
> >> > > > >> client,
> >> > > > >> > >>> trying to
> >> > > > >> > >>> > > > use some "external" facing feature. Such a client
> >> would
> >> > > > learn
> >> > > > >> > >>> latest
> >> > > > >> > >>> > > > finalized features via ApiVersionsRequest. Ex:
> >> > > > >> group_coordinator
> >> > > > >> > >>> feature
> >> > > > >> > >>> > > > described in the KIP.
> >> > > > >> > >>> > > >
> >> > > > >> > >>> > > > Next, imagine that for F, the max_version_level is
> >> > > > >> successfully
> >> > > > >> > >>> bumped by
> >> > > > >> > >>> > > > +1 (via Controller API). Now it is guaranteed that
> >> all
> >> > > > brokers
> >> > > > >> > >>> (i.e.
> >> > > > >> > >>> > > > internal clients) understand max_version_level + 1.
> >> > > However,
> >> > > > >> it is
> >> > > > >> > >>> still
> >> > > > >> > >>> > > > not guaranteed that all external clients have
> support
> >> > for
> >> > > > (or
> >> > > > >> have
> >> > > > >> > >>> > > > activated) the logic for the newer version. Why?
> >> > Because,
> >> > > > >> this is
> >> > > > >> > >>> > > > subjective as explained next:
> >> > > > >> > >>> > > >
> >> > > > >> > >>> > > > 1. On one hand, imagine F as an internal feature
> only
> >> > > > >> relevant to
> >> > > > >> > >>> > > Brokers.
> >> > > > >> > >>> > > > The binary for the internal client logic is
> >> controlled
> >> > by
> >> > > > >> Broker
> >> > > > >> > >>> cluster
> >> > > > >> > >>> > > > deployments. When shipping a new Broker release, we
> >> > > wouldn't
> >> > > > >> bump
> >> > > > >> > >>> max
> >> > > > >> > >>> > > > "supported" feature version for F by 1, unless we
> >> have
> >> > > > >> introduced
> >> > > > >> > >>> some
> >> > > > >> > >>> > > new
> >> > > > >> > >>> > > > logic (with a potentially breaking change) in the
> >> > Broker.
> >> > > > >> > >>> Furthermore,
> >> > > > >> > >>> > > such
> >> > > > >> > >>> > > > feature logic in the broker should/will not be
> >> > implemented
> >> > > > in
> >> > > > >> a
> >> > > > >> > >>> way that
> >> > > > >> > >>> > > it
> >> > > > >> > >>> > > > would activate logic for an older feature version
> >> after
> >> > it
> >> > > > has
> >> > > > >> > >>> migrated
> >> > > > >> > >>> > > to
> >> > > > >> > >>> > > > using the logic for a newer feature version
> (because
> >> > this
> >> > > > >> could
> >> > > > >> > >>> break the
> >> > > > >> > >>> > > > cluster!). For these cases, max_version_level will
> be
> >> > very
> >> > > > >> useful
> >> > > > >> > >>> for
> >> > > > >> > >>> > > > decision making.
> >> > > > >> > >>> > > >
> >> > > > >> > >>> > > > 2. On the other hand, imagine F as an external
> facing
> >> > > > feature.
> >> > > > >> > >>> External
> >> > > > >> > >>> > > > clients are not within the control of Broker
> >> cluster. An
> >> > > > >> external
> >> > > > >> > >>> client
> >> > > > >> > >>> > > > may not have upgraded it's code (yet) to use
> >> > > > >> 'max_version_level +
> >> > > > >> > >>> 1'.
> >> > > > >> > >>> > > But,
> >> > > > >> > >>> > > > the Kafka cluster could have been deployed with
> >> support
> >> > > for
> >> > > > >> > >>> > > > 'max_version_level + 1' of an external facing
> >> feature.
> >> > > Now,
> >> > > > >> these
> >> > > > >> > >>> > > external
> >> > > > >> > >>> > > > clients (until an upgrade) are benefitted in
> learning
> >> > > "what
> >> > > > >> is the
> >> > > > >> > >>> lowest
> >> > > > >> > >>> > > > common version for F among all brokers?". This is
> >> where
> >> > > the
> >> > > > >> > >>> > > > "min_version_level" becomes useful. Using this, a
> >> client
> >> > > > could
> >> > > > >> > >>> learn the
> >> > > > >> > >>> > > > specific supported versions that's lower than
> >> > > > >> max_version_level
> >> > > > >> > >>> (instead
> >> > > > >> > >>> > > of
> >> > > > >> > >>> > > > assuming that all brokers support the range: [1,
> >> > > > >> > >>> max_version_level]). For
> >> > > > >> > >>> > > > example, if the cluster deprecates
> >> "min_version_level",
> >> > > then
> >> > > > >> the
> >> > > > >> > >>> client
> >> > > > >> > >>> > > > becomes aware because it periodically learns the
> >> latest
> >> > > > >> > >>> > > "min_version_level"
> >> > > > >> > >>> > > > via ApiVersionsRequest.
> >> > > > >> > >>> > > >
> >> > > > >> > >>> > >
> >> > > > >> > >>> > > Thanks for the explanation.  I agree that this does
> >> make
> >> > > sense
> >> > > > >> when
> >> > > > >> > >>> you
> >> > > > >> > >>> > > take the client perspective into account.
> >> > > > >> > >>> > >
> >> > > > >> > >>> > > best,
> >> > > > >> > >>> > > Colin
> >> > > > >> > >>> > >
> >> > > > >> > >>> > >
> >> > > > >> > >>> > > >
> >> > > > >> > >>> > > > Cheers,
> >> > > > >> > >>> > > > Kowshik
> >> > > > >> > >>> > > >
> >> > > > >> > >>> > > >
> >> > > > >> > >>> > > >
> >> > > > >> > >>> > > >
> >> > > > >> > >>> > > > On Thu, Apr 23, 2020 at 12:07 PM Colin McCabe <
> >> > > > >> cmcc...@apache.org>
> >> > > > >> > >>> > > wrote:
> >> > > > >> > >>> > > >
> >> > > > >> > >>> > > > > Hi Kowshik,
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > > Thanks again for working on this-- it looks
> >> great.  I
> >> > > went
> >> > > > >> over
> >> > > > >> > >>> the KIP
> >> > > > >> > >>> > > > > again and have a few more comments.
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > > ===
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > > It would be good to note that deprecating a
> feature
> >> > > > version
> >> > > > >> (in
> >> > > > >> > >>> other
> >> > > > >> > >>> > > > > words, increasing minVersionLevel on the broker)
> >> is an
> >> > > > >> > >>> incompatible
> >> > > > >> > >>> > > change,
> >> > > > >> > >>> > > > > which requires a major release of Kafka.
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > > ===
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > > I think the strategy for transitioning from not
> >> > having a
> >> > > > >> > >>> /features
> >> > > > >> > >>> > > znode
> >> > > > >> > >>> > > > > to having one could use some work. The current
> >> > proposal
> >> > > is
> >> > > > >> to
> >> > > > >> > >>> wait for
> >> > > > >> > >>> > > all
> >> > > > >> > >>> > > > > the brokers to fill in their feature znodes and
> >> then
> >> > > pick
> >> > > > >> the
> >> > > > >> > >>> highest
> >> > > > >> > >>> > > > > common versions.  But that requires blocking in
> the
> >> > > > >> controller
> >> > > > >> > >>> startup
> >> > > > >> > >>> > > code
> >> > > > >> > >>> > > > > until the whole cluster is active (technically, a
> >> > point
> >> > > in
> >> > > > >> time
> >> > > > >> > >>> which
> >> > > > >> > >>> > > we
> >> > > > >> > >>> > > > > never really know that we have reached...)
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > > Instead, I think it would be better to have a
> >> strategy
> >> > > > like
> >> > > > >> this:
> >> > > > >> > >>> > > > > 1. If the controller comes up and there is no
> >> > /features
> >> > > > >> znode
> >> > > > >> > >>> AND the
> >> > > > >> > >>> > > IBP
> >> > > > >> > >>> > > > > is less than 2.6, create a /features znode where
> >> all
> >> > the
> >> > > > >> > >>> features are
> >> > > > >> > >>> > > > > disabled.
> >> > > > >> > >>> > > > > 2. If the controller comes up and there is no
> >> > /features
> >> > > > >> znode
> >> > > > >> > >>> AND the
> >> > > > >> > >>> > > IBP
> >> > > > >> > >>> > > > > is greater than or equal to 2.6, create a
> /features
> >> > > znode
> >> > > > >> where
> >> > > > >> > >>> all the
> >> > > > >> > >>> > > > > features are enabled at the highest versions
> >> supported
> >> > > by
> >> > > > >> the
> >> > > > >> > >>> > > controller.
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > > People upgrading from the pre-KIP-584
> >> > > > >> > >>> <https://issues.apache.org/jira/browse/KIP-584>
> >> > > > >> > >>> > > > > <https://issues.apache.org/jira/browse/KIP-584>
> >> world
> >> > > > will
> >> > > > >> end
> >> > > > >> > >>> up in
> >> > > > >> > >>> > > case
> >> > > > >> > >>> > > > > #1 because they have to do a double roll to
> >> upgrade,
> >> > and
> >> > > > >> during
> >> > > > >> > >>> the
> >> > > > >> > >>> > > first
> >> > > > >> > >>> > > > > roll, the IBP is unchanged.  People creating new
> >> > > clusters
> >> > > > >> from
> >> > > > >> > >>> scratch
> >> > > > >> > >>> > > will
> >> > > > >> > >>> > > > > end up in case #2, which is what we want since we
> >> > don't
> >> > > > >> want a
> >> > > > >> > >>> brand
> >> > > > >> > >>> > > new
> >> > > > >> > >>> > > > > cluster to be using old feature flag versions.
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > > ===
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > > UpdateFeaturesResponse#ErrorMessage should
> specify
> >> > > > >> > >>> nullableVersions
> >> > > > >> > >>> > > since
> >> > > > >> > >>> > > > > null is a valid value here
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > > Also, in the message format, the tags we use need
> >> to
> >> > > start
> >> > > > >> at 0.
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > > ===
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > > I don't think we need the
> >> FEATURE_UPDATE_IN_PROGRESS
> >> > > error
> >> > > > >> > >>> code.  The
> >> > > > >> > >>> > > > > controller is basically single-threaded and will
> >> only
> >> > do
> >> > > > >> one of
> >> > > > >> > >>> these
> >> > > > >> > >>> > > > > operations at once.  Even if it weren't, though,
> we
> >> > > could
> >> > > > >> simply
> >> > > > >> > >>> block
> >> > > > >> > >>> > > the
> >> > > > >> > >>> > > > > second operation behind the first one.
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > > ===
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > > For updateFeatures, it would be good to specify
> >> that
> >> > if
> >> > > a
> >> > > > >> single
> >> > > > >> > >>> > > feature
> >> > > > >> > >>> > > > > version update in the batch can't be done, none
> of
> >> > them
> >> > > > are
> >> > > > >> > >>> done.  I
> >> > > > >> > >>> > > think
> >> > > > >> > >>> > > > > this was the intention, but I wasn't able to find
> >> it
> >> > > > >> spelled out
> >> > > > >> > >>> > > (maybe i
> >> > > > >> > >>> > > > > missed it).
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > > ===
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > > And now, something a little bit bigger (sorry).
> >> For
> >> > > > >> finalized
> >> > > > >> > >>> > > features,
> >> > > > >> > >>> > > > > why do we need both min_version_level and
> >> > > > max_version_level?
> >> > > > >> > >>> Assuming
> >> > > > >> > >>> > > that
> >> > > > >> > >>> > > > > we want all the brokers to be on the same feature
> >> > > version
> >> > > > >> level,
> >> > > > >> > >>> we
> >> > > > >> > >>> > > really
> >> > > > >> > >>> > > > > only care about three numbers for each feature,
> >> right?
> >> > > > The
> >> > > > >> > >>> minimum
> >> > > > >> > >>> > > > > supported version level, the maximum supported
> >> version
> >> > > > >> level,
> >> > > > >> > >>> and the
> >> > > > >> > >>> > > > > current active version level.
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > > We don't actually want different brokers to be on
> >> > > > different
> >> > > > >> > >>> versions of
> >> > > > >> > >>> > > > > the same feature, right?  So we can just have one
> >> > number
> >> > > > for
> >> > > > >> > >>> current
> >> > > > >> > >>> > > > > version level, rather than two.  At least that's
> >> what
> >> > I
> >> > > > was
> >> > > > >> > >>> thinking
> >> > > > >> > >>> > > -- let
> >> > > > >> > >>> > > > > me know if I missed something.
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > > best,
> >> > > > >> > >>> > > > > Colin
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > > > On Tue, Apr 21, 2020, at 13:01, Dhruvil Shah
> wrote:
> >> > > > >> > >>> > > > > > Thanks for the KIP! +1 (non-binding)
> >> > > > >> > >>> > > > > >
> >> > > > >> > >>> > > > > > On Tue, Apr 21, 2020 at 6:09 AM David Jacot <
> >> > > > >> > >>> dja...@confluent.io>
> >> > > > >> > >>> > > wrote:
> >> > > > >> > >>> > > > > >
> >> > > > >> > >>> > > > > > > Great KIP, thanks! +1 (non-binding)
> >> > > > >> > >>> > > > > > >
> >> > > > >> > >>> > > > > > > On Fri, Apr 17, 2020 at 8:56 PM Guozhang
> Wang <
> >> > > > >> > >>> wangg...@gmail.com>
> >> > > > >> > >>> > > > > wrote:
> >> > > > >> > >>> > > > > > >
> >> > > > >> > >>> > > > > > > > Thanks for the great KIP Kowshik, +1
> >> (binding).
> >> > > > >> > >>> > > > > > > >
> >> > > > >> > >>> > > > > > > > On Fri, Apr 17, 2020 at 11:22 AM Jun Rao <
> >> > > > >> j...@confluent.io
> >> > > > >> > >>> >
> >> > > > >> > >>> > > wrote:
> >> > > > >> > >>> > > > > > > >
> >> > > > >> > >>> > > > > > > > > Hi, Kowshik,
> >> > > > >> > >>> > > > > > > > >
> >> > > > >> > >>> > > > > > > > > Thanks for the KIP. +1
> >> > > > >> > >>> > > > > > > > >
> >> > > > >> > >>> > > > > > > > > Jun
> >> > > > >> > >>> > > > > > > > >
> >> > > > >> > >>> > > > > > > > > On Thu, Apr 16, 2020 at 11:14 AM Kowshik
> >> > > Prakasam
> >> > > > <
> >> > > > >> > >>> > > > > > > > kpraka...@confluent.io>
> >> > > > >> > >>> > > > > > > > > wrote:
> >> > > > >> > >>> > > > > > > > >
> >> > > > >> > >>> > > > > > > > > > Hi all,
> >> > > > >> > >>> > > > > > > > > >
> >> > > > >> > >>> > > > > > > > > > I'd like to start a vote for KIP-584
> >> > > > >> > >>> <https://issues.apache.org/jira/browse/KIP-584>
> >> > > > >> > >>> > > > > <https://issues.apache.org/jira/browse/KIP-584>.
> >> The
> >> > > link
> >> > > > >> to
> >> > > > >> > >>> the KIP
> >> > > > >> > >>> > > can
> >> > > > >> > >>> > > > > be
> >> > > > >> > >>> > > > > > > found
> >> > > > >> > >>> > > > > > > > > > here:
> >> > > > >> > >>> > > > > > > > > >
> >> > > > >> > >>> > > > > > > > > >
> >> > > > >> > >>> > > > > > > > >
> >> > > > >> > >>> > > > > > > >
> >> > > > >> > >>> > > > > > >
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > >
> >> > > > >> > >>>
> >> > > > >>
> >> > > >
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features
> >> > > > >> > >>> > > > > > > > > > .
> >> > > > >> > >>> > > > > > > > > >
> >> > > > >> > >>> > > > > > > > > > Thanks!
> >> > > > >> > >>> > > > > > > > > >
> >> > > > >> > >>> > > > > > > > > >
> >> > > > >> > >>> > > > > > > > > > Cheers,
> >> > > > >> > >>> > > > > > > > > > Kowshik
> >> > > > >> > >>> > > > > > > > > >
> >> > > > >> > >>> > > > > > > > >
> >> > > > >> > >>> > > > > > > >
> >> > > > >> > >>> > > > > > > >
> >> > > > >> > >>> > > > > > > > --
> >> > > > >> > >>> > > > > > > > -- Guozhang
> >> > > > >> > >>> > > > > > > >
> >> > > > >> > >>> > > > > > >
> >> > > > >> > >>> > > > > >
> >> > > > >> > >>> > > > >
> >> > > > >> > >>> > > >
> >> > > > >> > >>> > >
> >> > > > >> > >>> >
> >> > > > >> > >>>
> >> > > > >> > >>
> >> > > > >> >
> >> > > > >>
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
>

Reply via email to