Hi all,

I wanted to let you know that I have made the following minor changes to
the `kafka-features` CLI tool description in the KIP-584 write up. The
purpose is to ensure the design is correct for a few things which came up
during implementation:

1. The CLI tool now produces a tab-formatted output instead of JSON. This
aligns with the type of format produced by other admin CLI tools of Kafka,
ex: `kafka-topics`.
2. Whenever feature updates are performed, the output of the CLI tool shows
the result of each feature update that was applied.
3. The CLI tool accepts an optional argument `--dry-run` which lets the
user preview the feature updates before applying them.

The following section of the KIP has been updated with the above changes:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP584:Versioningschemeforfeatures-Toolingsupport

Please let me know if you have any questions.


Cheers,
Kowshik


On Thu, Oct 8, 2020 at 1:12 AM Kowshik Prakasam <kpraka...@confluent.io>
wrote:

> Hi Jun,
>
> This is a very good point. I have updated the feature version deprecation
> section mentioning the same:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP584:Versioningschemeforfeatures-Featureversiondeprecation
> .
>
> Thank you for the suggestion.
>
>
> Cheers,
> Kowshik
>
>
> On Tue, Oct 6, 2020 at 5:30 PM Jun Rao <j...@confluent.io> wrote:
>
>> Hi, Kowshik,
>>
>> Thanks for the follow up. Both look good to me.
>>
>> For 2, it would be useful to also add that an admin should make sure that
>> no clients are using a deprecated feature version (e.g. using the client
>> version metric) before deploying a release that deprecates it.
>>
>> Thanks,
>>
>> Jun
>>
>> On Tue, Oct 6, 2020 at 3:46 PM Kowshik Prakasam <kpraka...@confluent.io>
>> wrote:
>>
>> > Hi Jun,
>> >
>> > I have added the following details in the KIP-584 write up:
>> >
>> > 1. Deployment, IBP deprecation and avoidance of double rolls. This
>> section
>> > talks about the various phases of work that would be required to use
>> this
>> > KIP to eventually avoid Broker double rolls in the cluster (whenever IBP
>> > values are advanced). Link to section:
>> >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP584:Versioningschemeforfeatures-Deployment,IBPdeprecationandavoidanceofdoublerolls
>> > .
>> >
>> > 2. Feature version deprecation. This section explains the idea for
>> feature
>> > version deprecation (using highest supported feature min version) which
>> you
>> > had proposed during code review:
>> >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP584:Versioningschemeforfeatures-Featureversiondeprecation
>> > .
>> >
>> > Please let me know if you have any questions.
>> >
>> >
>> > Cheers,
>> > Kowshik
>> >
>> >
>> > On Tue, Sep 29, 2020 at 11:07 AM Jun Rao <j...@confluent.io> wrote:
>> >
>> > > Hi, Kowshik,
>> > >
>> > > Thanks for the update. Regarding enabling a single rolling restart in
>> the
>> > > future, could we sketch out a bit how this will work by treating IBP
>> as a
>> > > feature? For example, IBP currently uses the release version and this
>> KIP
>> > > uses an integer for versions. How do we bridge the gap between the
>> two?
>> > > Does min.version still make sense for IBP as a feature?
>> > >
>> > > Thanks,
>> > >
>> > > Jun
>> > >
>> > > On Fri, Sep 25, 2020 at 5:57 PM Kowshik Prakasam <
>> kpraka...@confluent.io
>> > >
>> > > wrote:
>> > >
>> > > > Hi Colin,
>> > > >
>> > > > Thanks for the feedback. Those are very good points. I have made the
>> > > > following changes to the KIP as you had suggested:
>> > > > 1. Included the `timeoutMs` field in the `UpdateFeaturesRequest`
>> > schema.
>> > > > The initial implementation won't be making use of the field, but we
>> can
>> > > > always use it in the future as the need arises.
>> > > > 2. Modified the `FinalizedFeaturesEpoch` field in
>> `ApiVersionsResponse`
>> > > to
>> > > > use int64. This is to avoid overflow problems in the future once ZK
>> is
>> > > > gone.
>> > > >
>> > > > I have also incorporated these changes into the versioning write
>> path
>> > PR
>> > > > that is currently under review:
>> > > https://github.com/apache/kafka/pull/9001.
>> > > >
>> > > >
>> > > > Cheers,
>> > > > Kowshik
>> > > >
>> > > >
>> > > >
>> > > > On Fri, Sep 25, 2020 at 4:57 PM Kowshik Prakasam <
>> > kpraka...@confluent.io
>> > > >
>> > > > wrote:
>> > > >
>> > > > > Hi Jun,
>> > > > >
>> > > > > Thanks for the feedback. It's a very good point. I have now
>> modified
>> > > the
>> > > > > KIP-584 write-up "goals" section a bit. It now mentions one of the
>> > > goals
>> > > > as
>> > > > > enabling rolling upgrades using a single restart (instead of 2).
>> > Also I
>> > > > > have removed the text explicitly aiming for deprecation of IBP.
>> Note
>> > > that
>> > > > > previously under "Potential features in Kafka" the IBP was
>> mentioned
>> > > > under
>> > > > > point (4) as a possible coarse-grained feature. Hopefully, now
>> the 2
>> > > > > sections of the KIP align with each other well.
>> > > > >
>> > > > >
>> > > > > Cheers,
>> > > > > Kowshik
>> > > > >
>> > > > >
>> > > > > On Fri, Sep 25, 2020 at 2:03 PM Colin McCabe <cmcc...@apache.org>
>> > > wrote:
>> > > > >
>> > > > >> On Tue, Sep 22, 2020, at 00:43, Kowshik Prakasam wrote:
>> > > > >> > Hi all,
>> > > > >> >
>> > > > >> > I wanted to let you know that I have made the following
>> changes to
>> > > the
>> > > > >> > KIP-584 write up. The purpose is to ensure the design is
>> correct
>> > > for a
>> > > > >> few
>> > > > >> > things which came up during implementation:
>> > > > >> >
>> > > > >>
>> > > > >> Hi Kowshik,
>> > > > >>
>> > > > >> Thanks for the updates.
>> > > > >>
>> > > > >> >
>> > > > >> > 1. Per FeatureUpdate error code: The UPDATE_FEATURES controller
>> > API
>> > > is
>> > > > >> no
>> > > > >> > longer transactional. Going forward, we allow for individual
>> > > > >> FeatureUpdate
>> > > > >> > to succeed/fail in the request. As a result, the response
>> schema
>> > now
>> > > > >> > contains an error code per FeatureUpdate as well as a top-level
>> > > error
>> > > > >> code.
>> > > > >> > Overall this is a better design because it better represents
>> the
>> > > > nature
>> > > > >> of
>> > > > >> > the API: each FeatureUpdate in the request is independent of
>> the
>> > > other
>> > > > >> > updates, and the controller can process/apply these
>> independently
>> > to
>> > > > ZK.
>> > > > >> > When an UPDATE_FEATURES request fails, this new design provides
>> > > better
>> > > > >> > clarity to the caller on which FeatureUpdate could not be
>> applied
>> > > (via
>> > > > >> the
>> > > > >> > individual error codes). In the previous design, we were
>> unable to
>> > > > >> achieve
>> > > > >> > such an increased level of clarity in communicating the error
>> > codes.
>> > > > >> >
>> > > > >>
>> > > > >> OK
>> > > > >>
>> > > > >> >
>> > > > >> > 2. Due to #1, there were some minor changes required to the
>> > proposed
>> > > > >> Admin
>> > > > >> > APIs (describeFeatures and updateFeatures). A few unnecessary
>> > public
>> > > > >> APIs
>> > > > >> > have been removed, and couple essential ones have been added.
>> The
>> > > > latest
>> > > > >> > changes now represent the latest design.
>> > > > >> >
>> > > > >> > 3. The timeoutMs field has been removed from the the
>> > UPDATE_FEATURES
>> > > > API
>> > > > >> > request, since it was not found to be required during
>> > > implementation.
>> > > > >> >
>> > > > >>
>> > > > >> Please don't get rid of timeoutMs.  timeoutMs is required if you
>> > want
>> > > to
>> > > > >> implement the ability to timeout the call if the controller can't
>> > get
>> > > > to it
>> > > > >> in time.  This is important for avoiding congestion collapse
>> where
>> > the
>> > > > >> controller collapses under the weight of lots of retries of the
>> same
>> > > > set of
>> > > > >> calls.
>> > > > >>
>> > > > >> We may not be able to do it in the initial implementation, but we
>> > will
>> > > > >> eventually implement this for all the controller-bound RPCs.
>> > > > >>
>> > > > >> > >
>> > > > >> > > 2. Finalized feature version epoch data type has been made
>> to be
>> > > > int32
>> > > > >> > > (instead of int64). The reason is that the epoch value is the
>> > > value
>> > > > >> of ZK
>> > > > >> > > node version, whose data type is int32.
>> > > > >> > >
>> > > > >>
>> > > > >> Sorry, I missed this earlier.  Using 16 bit feature levels seems
>> > fine.
>> > > > >> However, please don't use a 32-bit epoch here.  We deliberately
>> made
>> > > the
>> > > > >> epoch 64 bits to avoid overflow problems in the future once ZK is
>> > > gone.
>> > > > >>
>> > > > >> best,
>> > > > >> Colin
>> > > > >>
>> > > > >> > > 3. Introduced a new 'status' field in the '/features' ZK node
>> > > > schema.
>> > > > >> The
>> > > > >> > > purpose is to implement Colin's earlier point for the
>> strategy
>> > for
>> > > > >> > > transitioning from not having a /features znode to having
>> one.
>> > An
>> > > > >> > > explanation has been provided in the following section of the
>> > KIP
>> > > > >> detailing
>> > > > >> > > the different cases:
>> > > > >> > >
>> > > > >>
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP584:Versioningschemeforfeatures-FeatureZKnodestatus
>> > > > >> > > .
>> > > > >> > >
>> > > > >> > > Please let me know if you have any questions or concerns.
>> > > > >> > >
>> > > > >> > >
>> > > > >> > > Cheers,
>> > > > >> > > Kowshik
>> > > > >> > >
>> > > > >> > >
>> > > > >> > >
>> > > > >> > > Cheers,
>> > > > >> > > Kowshik
>> > > > >> > >
>> > > > >> > > On Tue, Apr 28, 2020 at 11:24 PM Kowshik Prakasam <
>> > > > >> kpraka...@confluent.io>
>> > > > >> > > wrote:
>> > > > >> > >
>> > > > >> > >> Hi all,
>> > > > >> > >>
>> > > > >> > >> This KIP vote has been open for ~12 days. The summary of the
>> > > votes
>> > > > is
>> > > > >> > >> that we have 3 binding votes (Colin, Guozhang, Jun), and 3
>> > > > >> non-binding
>> > > > >> > >> votes (David, Dhruvil, Boyang). Therefore, the KIP vote
>> passes.
>> > > > I'll
>> > > > >> mark
>> > > > >> > >> KIP as accepted and start working on the implementation.
>> > > > >> > >>
>> > > > >> > >> Thanks a lot!
>> > > > >> > >>
>> > > > >> > >>
>> > > > >> > >> Cheers,
>> > > > >> > >> Kowshik
>> > > > >> > >>
>> > > > >> > >> On Mon, Apr 27, 2020 at 12:15 PM Colin McCabe <
>> > > cmcc...@apache.org>
>> > > > >> wrote:
>> > > > >> > >>
>> > > > >> > >>> Thanks, Kowshik.  +1 (binding)
>> > > > >> > >>>
>> > > > >> > >>> best,
>> > > > >> > >>> Colin
>> > > > >> > >>>
>> > > > >> > >>> On Sat, Apr 25, 2020, at 13:20, Kowshik Prakasam wrote:
>> > > > >> > >>> > Hi Colin,
>> > > > >> > >>> >
>> > > > >> > >>> > Thanks for the explanation! I agree with you, and I have
>> > > updated
>> > > > >> the
>> > > > >> > >>> > KIP.
>> > > > >> > >>> > Here is a link to relevant section:
>> > > > >> > >>> >
>> > > > >> > >>>
>> > > > >>
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Controller:ZKnodebootstrapwithdefaultvalues
>> > > > >> > >>> >
>> > > > >> > >>> >
>> > > > >> > >>> > Cheers,
>> > > > >> > >>> > Kowshik
>> > > > >> > >>> >
>> > > > >> > >>> > On Fri, Apr 24, 2020 at 8:50 PM Colin McCabe <
>> > > > cmcc...@apache.org>
>> > > > >> > >>> wrote:
>> > > > >> > >>> >
>> > > > >> > >>> > > On Fri, Apr 24, 2020, at 00:01, Kowshik Prakasam wrote:
>> > > > >> > >>> > > > (Kowshik): Great point! However for case #1, I'm not
>> > sure
>> > > > why
>> > > > >> we
>> > > > >> > >>> need to
>> > > > >> > >>> > > > create a '/features' ZK node with disabled features.
>> > > > Instead,
>> > > > >> do
>> > > > >> > >>> you see
>> > > > >> > >>> > > > any drawback if we just do not create it? i.e. if
>> IBP is
>> > > > less
>> > > > >> than
>> > > > >> > >>> 2.6,
>> > > > >> > >>> > > the
>> > > > >> > >>> > > > controller treats the case as though the versioning
>> > system
>> > > > is
>> > > > >> > >>> completely
>> > > > >> > >>> > > > disabled, and would not create a non-existing
>> > '/features'
>> > > > >> node.
>> > > > >> > >>> > >
>> > > > >> > >>> > > Hi Kowshik,
>> > > > >> > >>> > >
>> > > > >> > >>> > > When the IBP is less than 2.6, but the software has
>> been
>> > > > >> upgraded to
>> > > > >> > >>> a
>> > > > >> > >>> > > state where it supports this KIP, that
>> > > > >> > >>> > >  means the user is upgrading from an earlier version of
>> > the
>> > > > >> > >>> software.  In
>> > > > >> > >>> > > this case, we want to start with all the features
>> disabled
>> > > and
>> > > > >> allow
>> > > > >> > >>> the
>> > > > >> > >>> > > user to enable them when they are ready.
>> > > > >> > >>> > >
>> > > > >> > >>> > > Enabling all the possible features immediately after an
>> > > > upgrade
>> > > > >> > >>> could be
>> > > > >> > >>> > > harmful to the cluster.  On the other hand, for a new
>> > > cluster,
>> > > > >> we do
>> > > > >> > >>> want
>> > > > >> > >>> > > to enable all the possible features immediately . I was
>> > > > >> proposing
>> > > > >> > >>> this as a
>> > > > >> > >>> > > way to distinguish the two cases (since the new cluster
>> > will
>> > > > >> never be
>> > > > >> > >>> > > started with an old IBP).
>> > > > >> > >>> > >
>> > > > >> > >>> > > > Colin MccCabe wrote:
>> > > > >> > >>> > > > > And now, something a little bit bigger (sorry).
>> For
>> > > > >> finalized
>> > > > >> > >>> > > features,
>> > > > >> > >>> > > > > why do we need both min_version_level and
>> > > > max_version_level?
>> > > > >> > >>> Assuming
>> > > > >> > >>> > > that
>> > > > >> > >>> > > > > we want all the brokers to be on the same feature
>> > > version
>> > > > >> level,
>> > > > >> > >>> we
>> > > > >> > >>> > > really only care
>> > > > >> > >>> > > > > about three numbers for each feature, right?  The
>> > > minimum
>> > > > >> > >>> supported
>> > > > >> > >>> > > version
>> > > > >> > >>> > > > > level, the maximum supported version level, and the
>> > > > current
>> > > > >> > >>> active
>> > > > >> > >>> > > version level.
>> > > > >> > >>> > > >
>> > > > >> > >>> > > > > We don't actually want different brokers to be on
>> > > > different
>> > > > >> > >>> versions of
>> > > > >> > >>> > > > > the same feature, right?  So we can just have one
>> > number
>> > > > for
>> > > > >> > >>> current
>> > > > >> > >>> > > > > version level, rather than two.  At least that's
>> what
>> > I
>> > > > was
>> > > > >> > >>> thinking
>> > > > >> > >>> > > -- let
>> > > > >> > >>> > > > > me know if I missed something.
>> > > > >> > >>> > > >
>> > > > >> > >>> > > > (Kowshik): It is my understanding that the "current
>> > active
>> > > > >> version
>> > > > >> > >>> level"
>> > > > >> > >>> > > > that you have mentioned, is the "max_version_level".
>> But
>> > > we
>> > > > >> still
>> > > > >> > >>> > > > maintain/publish both min and max version levels,
>> > because,
>> > > > the
>> > > > >> > >>> detail
>> > > > >> > >>> > > about
>> > > > >> > >>> > > > min level is useful to external clients. This is
>> > described
>> > > > >> below.
>> > > > >> > >>> > > >
>> > > > >> > >>> > > > For any feature F, think of the closed range:
>> > > > >> [min_version_level,
>> > > > >> > >>> > > > max_version_level] as the range of finalized
>> versions,
>> > > > that's
>> > > > >> > >>> guaranteed
>> > > > >> > >>> > > to
>> > > > >> > >>> > > > be supported by all brokers in the cluster.
>> > > > >> > >>> > > >  - "max_version_level" is the finalized highest
>> common
>> > > > version
>> > > > >> > >>> among all
>> > > > >> > >>> > > > brokers,
>> > > > >> > >>> > > >  - "min_version_level" is the finalized lowest common
>> > > > version
>> > > > >> > >>> among all
>> > > > >> > >>> > > > brokers.
>> > > > >> > >>> > > >
>> > > > >> > >>> > > > Next, think of "client" here as the "user of the new
>> > > feature
>> > > > >> > >>> versions
>> > > > >> > >>> > > > system". Imagine that such a client learns about
>> > finalized
>> > > > >> feature
>> > > > >> > >>> > > > versions, and exercises some logic based on the
>> version.
>> > > > These
>> > > > >> > >>> clients
>> > > > >> > >>> > > can
>> > > > >> > >>> > > > be of 2 types:
>> > > > >> > >>> > > > 1. Some part of the broker code itself could behave
>> > like a
>> > > > >> client
>> > > > >> > >>> trying
>> > > > >> > >>> > > to
>> > > > >> > >>> > > > use some feature that's "internal" to the broker
>> > cluster.
>> > > > >> Such a
>> > > > >> > >>> client
>> > > > >> > >>> > > > would learn the latest finalized features via ZK.
>> > > > >> > >>> > > > 2. An external system (ex: Streams) could behave
>> like a
>> > > > >> client,
>> > > > >> > >>> trying to
>> > > > >> > >>> > > > use some "external" facing feature. Such a client
>> would
>> > > > learn
>> > > > >> > >>> latest
>> > > > >> > >>> > > > finalized features via ApiVersionsRequest. Ex:
>> > > > >> group_coordinator
>> > > > >> > >>> feature
>> > > > >> > >>> > > > described in the KIP.
>> > > > >> > >>> > > >
>> > > > >> > >>> > > > Next, imagine that for F, the max_version_level is
>> > > > >> successfully
>> > > > >> > >>> bumped by
>> > > > >> > >>> > > > +1 (via Controller API). Now it is guaranteed that
>> all
>> > > > brokers
>> > > > >> > >>> (i.e.
>> > > > >> > >>> > > > internal clients) understand max_version_level + 1.
>> > > However,
>> > > > >> it is
>> > > > >> > >>> still
>> > > > >> > >>> > > > not guaranteed that all external clients have support
>> > for
>> > > > (or
>> > > > >> have
>> > > > >> > >>> > > > activated) the logic for the newer version. Why?
>> > Because,
>> > > > >> this is
>> > > > >> > >>> > > > subjective as explained next:
>> > > > >> > >>> > > >
>> > > > >> > >>> > > > 1. On one hand, imagine F as an internal feature only
>> > > > >> relevant to
>> > > > >> > >>> > > Brokers.
>> > > > >> > >>> > > > The binary for the internal client logic is
>> controlled
>> > by
>> > > > >> Broker
>> > > > >> > >>> cluster
>> > > > >> > >>> > > > deployments. When shipping a new Broker release, we
>> > > wouldn't
>> > > > >> bump
>> > > > >> > >>> max
>> > > > >> > >>> > > > "supported" feature version for F by 1, unless we
>> have
>> > > > >> introduced
>> > > > >> > >>> some
>> > > > >> > >>> > > new
>> > > > >> > >>> > > > logic (with a potentially breaking change) in the
>> > Broker.
>> > > > >> > >>> Furthermore,
>> > > > >> > >>> > > such
>> > > > >> > >>> > > > feature logic in the broker should/will not be
>> > implemented
>> > > > in
>> > > > >> a
>> > > > >> > >>> way that
>> > > > >> > >>> > > it
>> > > > >> > >>> > > > would activate logic for an older feature version
>> after
>> > it
>> > > > has
>> > > > >> > >>> migrated
>> > > > >> > >>> > > to
>> > > > >> > >>> > > > using the logic for a newer feature version (because
>> > this
>> > > > >> could
>> > > > >> > >>> break the
>> > > > >> > >>> > > > cluster!). For these cases, max_version_level will be
>> > very
>> > > > >> useful
>> > > > >> > >>> for
>> > > > >> > >>> > > > decision making.
>> > > > >> > >>> > > >
>> > > > >> > >>> > > > 2. On the other hand, imagine F as an external facing
>> > > > feature.
>> > > > >> > >>> External
>> > > > >> > >>> > > > clients are not within the control of Broker
>> cluster. An
>> > > > >> external
>> > > > >> > >>> client
>> > > > >> > >>> > > > may not have upgraded it's code (yet) to use
>> > > > >> 'max_version_level +
>> > > > >> > >>> 1'.
>> > > > >> > >>> > > But,
>> > > > >> > >>> > > > the Kafka cluster could have been deployed with
>> support
>> > > for
>> > > > >> > >>> > > > 'max_version_level + 1' of an external facing
>> feature.
>> > > Now,
>> > > > >> these
>> > > > >> > >>> > > external
>> > > > >> > >>> > > > clients (until an upgrade) are benefitted in learning
>> > > "what
>> > > > >> is the
>> > > > >> > >>> lowest
>> > > > >> > >>> > > > common version for F among all brokers?". This is
>> where
>> > > the
>> > > > >> > >>> > > > "min_version_level" becomes useful. Using this, a
>> client
>> > > > could
>> > > > >> > >>> learn the
>> > > > >> > >>> > > > specific supported versions that's lower than
>> > > > >> max_version_level
>> > > > >> > >>> (instead
>> > > > >> > >>> > > of
>> > > > >> > >>> > > > assuming that all brokers support the range: [1,
>> > > > >> > >>> max_version_level]). For
>> > > > >> > >>> > > > example, if the cluster deprecates
>> "min_version_level",
>> > > then
>> > > > >> the
>> > > > >> > >>> client
>> > > > >> > >>> > > > becomes aware because it periodically learns the
>> latest
>> > > > >> > >>> > > "min_version_level"
>> > > > >> > >>> > > > via ApiVersionsRequest.
>> > > > >> > >>> > > >
>> > > > >> > >>> > >
>> > > > >> > >>> > > Thanks for the explanation.  I agree that this does
>> make
>> > > sense
>> > > > >> when
>> > > > >> > >>> you
>> > > > >> > >>> > > take the client perspective into account.
>> > > > >> > >>> > >
>> > > > >> > >>> > > best,
>> > > > >> > >>> > > Colin
>> > > > >> > >>> > >
>> > > > >> > >>> > >
>> > > > >> > >>> > > >
>> > > > >> > >>> > > > Cheers,
>> > > > >> > >>> > > > Kowshik
>> > > > >> > >>> > > >
>> > > > >> > >>> > > >
>> > > > >> > >>> > > >
>> > > > >> > >>> > > >
>> > > > >> > >>> > > > On Thu, Apr 23, 2020 at 12:07 PM Colin McCabe <
>> > > > >> cmcc...@apache.org>
>> > > > >> > >>> > > wrote:
>> > > > >> > >>> > > >
>> > > > >> > >>> > > > > Hi Kowshik,
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > Thanks again for working on this-- it looks
>> great.  I
>> > > went
>> > > > >> over
>> > > > >> > >>> the KIP
>> > > > >> > >>> > > > > again and have a few more comments.
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > ===
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > It would be good to note that deprecating a feature
>> > > > version
>> > > > >> (in
>> > > > >> > >>> other
>> > > > >> > >>> > > > > words, increasing minVersionLevel on the broker)
>> is an
>> > > > >> > >>> incompatible
>> > > > >> > >>> > > change,
>> > > > >> > >>> > > > > which requires a major release of Kafka.
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > ===
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > I think the strategy for transitioning from not
>> > having a
>> > > > >> > >>> /features
>> > > > >> > >>> > > znode
>> > > > >> > >>> > > > > to having one could use some work. The current
>> > proposal
>> > > is
>> > > > >> to
>> > > > >> > >>> wait for
>> > > > >> > >>> > > all
>> > > > >> > >>> > > > > the brokers to fill in their feature znodes and
>> then
>> > > pick
>> > > > >> the
>> > > > >> > >>> highest
>> > > > >> > >>> > > > > common versions.  But that requires blocking in the
>> > > > >> controller
>> > > > >> > >>> startup
>> > > > >> > >>> > > code
>> > > > >> > >>> > > > > until the whole cluster is active (technically, a
>> > point
>> > > in
>> > > > >> time
>> > > > >> > >>> which
>> > > > >> > >>> > > we
>> > > > >> > >>> > > > > never really know that we have reached...)
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > Instead, I think it would be better to have a
>> strategy
>> > > > like
>> > > > >> this:
>> > > > >> > >>> > > > > 1. If the controller comes up and there is no
>> > /features
>> > > > >> znode
>> > > > >> > >>> AND the
>> > > > >> > >>> > > IBP
>> > > > >> > >>> > > > > is less than 2.6, create a /features znode where
>> all
>> > the
>> > > > >> > >>> features are
>> > > > >> > >>> > > > > disabled.
>> > > > >> > >>> > > > > 2. If the controller comes up and there is no
>> > /features
>> > > > >> znode
>> > > > >> > >>> AND the
>> > > > >> > >>> > > IBP
>> > > > >> > >>> > > > > is greater than or equal to 2.6, create a /features
>> > > znode
>> > > > >> where
>> > > > >> > >>> all the
>> > > > >> > >>> > > > > features are enabled at the highest versions
>> supported
>> > > by
>> > > > >> the
>> > > > >> > >>> > > controller.
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > People upgrading from the pre-KIP-584
>> > > > >> > >>> <https://issues.apache.org/jira/browse/KIP-584>
>> > > > >> > >>> > > > > <https://issues.apache.org/jira/browse/KIP-584>
>> world
>> > > > will
>> > > > >> end
>> > > > >> > >>> up in
>> > > > >> > >>> > > case
>> > > > >> > >>> > > > > #1 because they have to do a double roll to
>> upgrade,
>> > and
>> > > > >> during
>> > > > >> > >>> the
>> > > > >> > >>> > > first
>> > > > >> > >>> > > > > roll, the IBP is unchanged.  People creating new
>> > > clusters
>> > > > >> from
>> > > > >> > >>> scratch
>> > > > >> > >>> > > will
>> > > > >> > >>> > > > > end up in case #2, which is what we want since we
>> > don't
>> > > > >> want a
>> > > > >> > >>> brand
>> > > > >> > >>> > > new
>> > > > >> > >>> > > > > cluster to be using old feature flag versions.
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > ===
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > UpdateFeaturesResponse#ErrorMessage should specify
>> > > > >> > >>> nullableVersions
>> > > > >> > >>> > > since
>> > > > >> > >>> > > > > null is a valid value here
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > Also, in the message format, the tags we use need
>> to
>> > > start
>> > > > >> at 0.
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > ===
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > I don't think we need the
>> FEATURE_UPDATE_IN_PROGRESS
>> > > error
>> > > > >> > >>> code.  The
>> > > > >> > >>> > > > > controller is basically single-threaded and will
>> only
>> > do
>> > > > >> one of
>> > > > >> > >>> these
>> > > > >> > >>> > > > > operations at once.  Even if it weren't, though, we
>> > > could
>> > > > >> simply
>> > > > >> > >>> block
>> > > > >> > >>> > > the
>> > > > >> > >>> > > > > second operation behind the first one.
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > ===
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > For updateFeatures, it would be good to specify
>> that
>> > if
>> > > a
>> > > > >> single
>> > > > >> > >>> > > feature
>> > > > >> > >>> > > > > version update in the batch can't be done, none of
>> > them
>> > > > are
>> > > > >> > >>> done.  I
>> > > > >> > >>> > > think
>> > > > >> > >>> > > > > this was the intention, but I wasn't able to find
>> it
>> > > > >> spelled out
>> > > > >> > >>> > > (maybe i
>> > > > >> > >>> > > > > missed it).
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > ===
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > And now, something a little bit bigger (sorry).
>> For
>> > > > >> finalized
>> > > > >> > >>> > > features,
>> > > > >> > >>> > > > > why do we need both min_version_level and
>> > > > max_version_level?
>> > > > >> > >>> Assuming
>> > > > >> > >>> > > that
>> > > > >> > >>> > > > > we want all the brokers to be on the same feature
>> > > version
>> > > > >> level,
>> > > > >> > >>> we
>> > > > >> > >>> > > really
>> > > > >> > >>> > > > > only care about three numbers for each feature,
>> right?
>> > > > The
>> > > > >> > >>> minimum
>> > > > >> > >>> > > > > supported version level, the maximum supported
>> version
>> > > > >> level,
>> > > > >> > >>> and the
>> > > > >> > >>> > > > > current active version level.
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > We don't actually want different brokers to be on
>> > > > different
>> > > > >> > >>> versions of
>> > > > >> > >>> > > > > the same feature, right?  So we can just have one
>> > number
>> > > > for
>> > > > >> > >>> current
>> > > > >> > >>> > > > > version level, rather than two.  At least that's
>> what
>> > I
>> > > > was
>> > > > >> > >>> thinking
>> > > > >> > >>> > > -- let
>> > > > >> > >>> > > > > me know if I missed something.
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > best,
>> > > > >> > >>> > > > > Colin
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > > > On Tue, Apr 21, 2020, at 13:01, Dhruvil Shah wrote:
>> > > > >> > >>> > > > > > Thanks for the KIP! +1 (non-binding)
>> > > > >> > >>> > > > > >
>> > > > >> > >>> > > > > > On Tue, Apr 21, 2020 at 6:09 AM David Jacot <
>> > > > >> > >>> dja...@confluent.io>
>> > > > >> > >>> > > wrote:
>> > > > >> > >>> > > > > >
>> > > > >> > >>> > > > > > > Great KIP, thanks! +1 (non-binding)
>> > > > >> > >>> > > > > > >
>> > > > >> > >>> > > > > > > On Fri, Apr 17, 2020 at 8:56 PM Guozhang Wang <
>> > > > >> > >>> wangg...@gmail.com>
>> > > > >> > >>> > > > > wrote:
>> > > > >> > >>> > > > > > >
>> > > > >> > >>> > > > > > > > Thanks for the great KIP Kowshik, +1
>> (binding).
>> > > > >> > >>> > > > > > > >
>> > > > >> > >>> > > > > > > > On Fri, Apr 17, 2020 at 11:22 AM Jun Rao <
>> > > > >> j...@confluent.io
>> > > > >> > >>> >
>> > > > >> > >>> > > wrote:
>> > > > >> > >>> > > > > > > >
>> > > > >> > >>> > > > > > > > > Hi, Kowshik,
>> > > > >> > >>> > > > > > > > >
>> > > > >> > >>> > > > > > > > > Thanks for the KIP. +1
>> > > > >> > >>> > > > > > > > >
>> > > > >> > >>> > > > > > > > > Jun
>> > > > >> > >>> > > > > > > > >
>> > > > >> > >>> > > > > > > > > On Thu, Apr 16, 2020 at 11:14 AM Kowshik
>> > > Prakasam
>> > > > <
>> > > > >> > >>> > > > > > > > kpraka...@confluent.io>
>> > > > >> > >>> > > > > > > > > wrote:
>> > > > >> > >>> > > > > > > > >
>> > > > >> > >>> > > > > > > > > > Hi all,
>> > > > >> > >>> > > > > > > > > >
>> > > > >> > >>> > > > > > > > > > I'd like to start a vote for KIP-584
>> > > > >> > >>> <https://issues.apache.org/jira/browse/KIP-584>
>> > > > >> > >>> > > > > <https://issues.apache.org/jira/browse/KIP-584>.
>> The
>> > > link
>> > > > >> to
>> > > > >> > >>> the KIP
>> > > > >> > >>> > > can
>> > > > >> > >>> > > > > be
>> > > > >> > >>> > > > > > > found
>> > > > >> > >>> > > > > > > > > > here:
>> > > > >> > >>> > > > > > > > > >
>> > > > >> > >>> > > > > > > > > >
>> > > > >> > >>> > > > > > > > >
>> > > > >> > >>> > > > > > > >
>> > > > >> > >>> > > > > > >
>> > > > >> > >>> > > > >
>> > > > >> > >>> > >
>> > > > >> > >>>
>> > > > >>
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features
>> > > > >> > >>> > > > > > > > > > .
>> > > > >> > >>> > > > > > > > > >
>> > > > >> > >>> > > > > > > > > > Thanks!
>> > > > >> > >>> > > > > > > > > >
>> > > > >> > >>> > > > > > > > > >
>> > > > >> > >>> > > > > > > > > > Cheers,
>> > > > >> > >>> > > > > > > > > > Kowshik
>> > > > >> > >>> > > > > > > > > >
>> > > > >> > >>> > > > > > > > >
>> > > > >> > >>> > > > > > > >
>> > > > >> > >>> > > > > > > >
>> > > > >> > >>> > > > > > > > --
>> > > > >> > >>> > > > > > > > -- Guozhang
>> > > > >> > >>> > > > > > > >
>> > > > >> > >>> > > > > > >
>> > > > >> > >>> > > > > >
>> > > > >> > >>> > > > >
>> > > > >> > >>> > > >
>> > > > >> > >>> > >
>> > > > >> > >>> >
>> > > > >> > >>>
>> > > > >> > >>
>> > > > >> >
>> > > > >>
>> > > > >
>> > > >
>> > >
>> >
>>
>

Reply via email to