Hi Jun,

Thank you for the suggestion! I have updated the KIP, please find my
response below.

> 200. I guess you are saying only when the allowDowngrade field is set, the
> finalized feature version can go backward. Otherwise, it can only go up.
> That makes sense. It would be useful to make that clear when explaining
> the usage of the allowDowngrade field. In the validation section, we
have  "
> /features' from {"max_version_level": X} to {"max_version_level": X’}", it
> seems that we need to mention Y there.

(Kowshik): Great point! Yes, that is correct. Done, I have updated the
validations
section explaining the above. Here is a link to this section:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Validations


Cheers,
Kowshik




On Wed, Apr 15, 2020 at 11:05 AM Jun Rao <j...@confluent.io> wrote:

> Hi, Kowshik,
>
> 200. I guess you are saying only when the allowDowngrade field is set, the
> finalized feature version can go backward. Otherwise, it can only go up.
> That makes sense. It would be useful to make that clear when explaining
> the usage of the allowDowngrade field. In the validation section, we have
> "
> /features' from {"max_version_level": X} to {"max_version_level": X’}", it
> seems that we need to mention Y there.
>
> Thanks,
>
> Jun
>
> On Wed, Apr 15, 2020 at 10:44 AM Kowshik Prakasam <kpraka...@confluent.io>
> wrote:
>
> > Hi Jun,
> >
> > Great question! Please find my response below.
> >
> > > 200. My understanding is that If the CLI tool passes the
> > > '--allow-downgrade' flag when updating a specific feature, then a
> future
> > > downgrade is possible. Otherwise, the feature is now downgradable. If
> so,
> > I
> > > was wondering how the controller remembers this since it can be
> restarted
> > > over time?
> >
> > (Kowshik): The purpose of the flag was to just restrict the user intent
> for
> > a specific request.
> > It seems to me that to avoid confusion, I could call the flag as
> > `--try-downgrade` instead.
> > Then this makes it clear, that, the controller just has to consider the
> ask
> > from
> > the user as an explicit request to attempt a downgrade.
> >
> > The flag does not act as an override on controller's decision making that
> > decides whether
> > a flag is downgradable (these decisions on whether to allow a flag to be
> > downgraded
> > from a specific version level, can be embedded in the controller code).
> >
> > Please let me know what you think.
> > Sorry if I misunderstood the original question.
> >
> >
> > Cheers,
> > Kowshik
> >
> >
> > On Wed, Apr 15, 2020 at 9:40 AM Jun Rao <j...@confluent.io> wrote:
> >
> > > Hi, Kowshik,
> > >
> > > Thanks for the reply. Makes sense. Just one more question.
> > >
> > > 200. My understanding is that If the CLI tool passes the
> > > '--allow-downgrade' flag when updating a specific feature, then a
> future
> > > downgrade is possible. Otherwise, the feature is now downgradable. If
> > so, I
> > > was wondering how the controller remembers this since it can be
> restarted
> > > over time?
> > >
> > > Jun
> > >
> > >
> > > On Tue, Apr 14, 2020 at 6:49 PM Kowshik Prakasam <
> kpraka...@confluent.io
> > >
> > > wrote:
> > >
> > > > Hi Jun,
> > > >
> > > > Thanks a lot for the feedback and the questions!
> > > > Please find my response below.
> > > >
> > > > > 200. The UpdateFeaturesRequest includes an AllowDowngrade field. It
> > > seems
> > > > > that field needs to be persisted somewhere in ZK?
> > > >
> > > > (Kowshik): Great question! Below is my explanation. Please help me
> > > > understand,
> > > > if you feel there are cases where we would need to still persist it
> in
> > > ZK.
> > > >
> > > > Firstly I have updated my thoughts into the KIP now, under the
> > > 'guidelines'
> > > > section:
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Guidelinesonfeatureversionsandworkflows
> > > >
> > > > The allowDowngrade boolean field is just to restrict the user intent,
> > and
> > > > to remind
> > > > them to double check their intent before proceeding. It should be set
> > to
> > > > true
> > > > by the user in a request, only when the user intent is to forcefully
> > > > "attempt" a
> > > > downgrade of a specific feature's max version level, to the provided
> > > value
> > > > in
> > > > the request.
> > > >
> > > > We can extend this safeguard. The controller (on it's end) can
> maintain
> > > > rules in the code, that, for safety reasons would outright reject
> > certain
> > > > downgrades
> > > > from a specific max_version_level for a specific feature. Such
> > rejections
> > > > may
> > > > happen depending on the feature being downgraded, and from what
> version
> > > > level.
> > > >
> > > > The CLI tool only allows a downgrade attempt in conjunction with
> > specific
> > > > flags and sub-commands. For example, in the CLI tool, if the user
> uses
> > > the
> > > > 'downgrade-all' command, or passes '--allow-downgrade' flag when
> > > updating a
> > > > specific feature, only then the tool will translate this ask to
> setting
> > > > 'allowDowngrade' field in the request to the server.
> > > >
> > > > > 201. UpdateFeaturesResponse has the following top level fields.
> > Should
> > > > > those fields be per feature?
> > > > >
> > > > >   "fields": [
> > > > >     { "name": "ErrorCode", "type": "int16", "versions": "0+",
> > > > >       "about": "The error code, or 0 if there was no error." },
> > > > >     { "name": "ErrorMessage", "type": "string", "versions": "0+",
> > > > >       "about": "The error message, or null if there was no error."
> }
> > > > >   ]
> > > >
> > > > (Kowshik): Great question!
> > > > As such, the API is transactional, as explained in the sections
> linked
> > > > below.
> > > > Either all provided FeatureUpdate was applied, or none.
> > > > It's the reason I felt we can have just one error code + message.
> > > > Happy to extend this if you feel otherwise. Please let me know.
> > > >
> > > > Link to sections:
> > > >
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-ChangestoKafkaController
> > > >
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Guarantees
> > > >
> > > > > 202. The /features path in ZK has a field min_version_level. Which
> > API
> > > > and
> > > > > tool can change that value?
> > > >
> > > > (Kowshik): Great question! Currently this cannot be modified by using
> > the
> > > > API or the tool.
> > > > Feature version deprecation (by raising min_version_level) can be
> done
> > > only
> > > > by the Controller directly. The rationale is explained in this
> section:
> > > >
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Featureversiondeprecation
> > > >
> > > >
> > > > Cheers,
> > > > Kowshik
> > > >
> > > > On Tue, Apr 14, 2020 at 5:33 PM Jun Rao <j...@confluent.io> wrote:
> > > >
> > > > > Hi, Kowshik,
> > > > >
> > > > > Thanks for addressing those comments. Just a few more minor
> comments.
> > > > >
> > > > > 200. The UpdateFeaturesRequest includes an AllowDowngrade field. It
> > > seems
> > > > > that field needs to be persisted somewhere in ZK?
> > > > >
> > > > > 201. UpdateFeaturesResponse has the following top level fields.
> > Should
> > > > > those fields be per feature?
> > > > >
> > > > >   "fields": [
> > > > >     { "name": "ErrorCode", "type": "int16", "versions": "0+",
> > > > >       "about": "The error code, or 0 if there was no error." },
> > > > >     { "name": "ErrorMessage", "type": "string", "versions": "0+",
> > > > >       "about": "The error message, or null if there was no error."
> }
> > > > >   ]
> > > > >
> > > > > 202. The /features path in ZK has a field min_version_level. Which
> > API
> > > > and
> > > > > tool can change that value?
> > > > >
> > > > > Jun
> > > > >
> > > > > On Mon, Apr 13, 2020 at 5:12 PM Kowshik Prakasam <
> > > kpraka...@confluent.io
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi Jun,
> > > > > >
> > > > > > Thanks for the feedback! I have updated the KIP-584 addressing
> your
> > > > > > comments.
> > > > > > Please find my response below.
> > > > > >
> > > > > > > 100.6 You can look for the sentence "This operation requires
> > ALTER
> > > on
> > > > > > > CLUSTER." in KIP-455. Also, you can check its usage in
> > > > > > > KafkaApis.authorize().
> > > > > >
> > > > > > (Kowshik): Done. Great point! For the newly introduced
> > > UPDATE_FEATURES
> > > > > api,
> > > > > > I have added a
> > > > > > requirement that AclOperation.ALTER is required on
> > > > ResourceType.CLUSTER.
> > > > > >
> > > > > > > 110. Keeping the feature version as int is probably fine. I
> just
> > > felt
> > > > > > that
> > > > > > > for some of the common user interactions, it's more convenient
> to
> > > > > > > relate that to a release version. For example, if a user wants
> to
> > > > > > downgrade
> > > > > > > to a release 2.5, it's easier for the user to use the tool like
> > > "tool
> > > > > > > --downgrade 2.5" instead of "tool --downgrade --feature X
> > --version
> > > > 6".
> > > > > >
> > > > > > (Kowshik): Great point. Generally, maximum feature version levels
> > are
> > > > not
> > > > > > downgradable after
> > > > > > they are finalized in the cluster. This is because, as a
> guideline
> > > > > bumping
> > > > > > feature version level usually is used mainly to convey important
> > > > breaking
> > > > > > changes.
> > > > > > Despite the above, there may be some extreme/rare cases where a
> > user
> > > > > wants
> > > > > > to downgrade
> > > > > > all features to a specific previous release. The user may want to
> > do
> > > > this
> > > > > > just
> > > > > > prior to rolling back a Kafka cluster to a previous release.
> > > > > >
> > > > > > To support the above, I have made a change to the KIP explaining
> > that
> > > > the
> > > > > > CLI tool is versioned.
> > > > > > The CLI tool internally has knowledge about a map of features to
> > > their
> > > > > > respective max
> > > > > > versions supported by the Broker. The tool's knowledge of
> features
> > > and
> > > > > > their version values,
> > > > > > is limited to the version of the CLI tool itself i.e. the
> > information
> > > > is
> > > > > > packaged into the CLI tool
> > > > > > when it is released. Whenever a Kafka release introduces a new
> > > feature
> > > > > > version, or modifies
> > > > > > an existing feature version, the CLI tool shall also be updated
> > with
> > > > this
> > > > > > information,
> > > > > > Newer versions of the CLI tool will be released as part of the
> > Kafka
> > > > > > releases.
> > > > > >
> > > > > > Therefore, to achieve the downgrade need, the user just needs to
> > run
> > > > the
> > > > > > version of
> > > > > > the CLI tool that's part of the particular previous release that
> > > he/she
> > > > > is
> > > > > > downgrading to.
> > > > > > To help the user with this, there is a new command added to the
> CLI
> > > > tool
> > > > > > called `downgrade-all`.
> > > > > > This essentially downgrades max version levels of all features in
> > the
> > > > > > cluster to the versions
> > > > > > known to the CLI tool internally.
> > > > > >
> > > > > > I have explained the above in the KIP under these sections:
> > > > > >
> > > > > > Tooling support (have explained that the CLI tool is versioned):
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Toolingsupport
> > > > > >
> > > > > > Regular CLI tool usage (please refer to point #3, and see the
> > tooling
> > > > > > example)
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-RegularCLItoolusage
> > > > > >
> > > > > > > 110. Similarly, if the client library finds a feature mismatch
> > with
> > > > the
> > > > > > broker,
> > > > > > > the client likely needs to log some error message for the user
> to
> > > > take
> > > > > > some
> > > > > > > actions. It's much more actionable if the error message is
> > "upgrade
> > > > the
> > > > > > > broker to release version 2.6" than just "upgrade the broker to
> > > > feature
> > > > > > > version 7".
> > > > > >
> > > > > > (Kowshik): That's a really good point! If we use ints for feature
> > > > > versions,
> > > > > > the best
> > > > > > message that client can print for debugging is "broker doesn't
> > > support
> > > > > > feature version 7", and alongside that print the supported
> version
> > > > range
> > > > > > returned
> > > > > > by the broker. Then, does it sound reasonable that the user could
> > > then
> > > > > > reference
> > > > > > Kafka release logs to figure out which version of the broker
> > release
> > > is
> > > > > > required
> > > > > > be deployed, to support feature version 7? I couldn't think of a
> > > better
> > > > > > strategy here.
> > > > > >
> > > > > > > 120. When should a developer bump up the version of a feature?
> > > > > >
> > > > > > (Kowshik): Great question! In the KIP, I have added a section:
> > > > > 'Guidelines
> > > > > > on feature versions and workflows'
> > > > > > providing some guidelines on when to use the versioned feature
> > flags,
> > > > and
> > > > > > what
> > > > > > are the regular workflows with the CLI tool.
> > > > > >
> > > > > > Link to the relevant sections:
> > > > > > Guidelines:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Guidelinesonfeatureversionsandworkflows
> > > > > >
> > > > > > Regular CLI tool usage:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-RegularCLItoolusage
> > > > > >
> > > > > > Advanced CLI tool usage:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-AdvancedCLItoolusage
> > > > > >
> > > > > >
> > > > > > Cheers,
> > > > > > Kowshik
> > > > > >
> > > > > >
> > > > > > On Fri, Apr 10, 2020 at 4:25 PM Jun Rao <j...@confluent.io>
> wrote:
> > > > > >
> > > > > > > Hi, Kowshik,
> > > > > > >
> > > > > > > Thanks for the reply. A few more comments.
> > > > > > >
> > > > > > > 110. Keeping the feature version as int is probably fine. I
> just
> > > felt
> > > > > > that
> > > > > > > for some of the common user interactions, it's more convenient
> to
> > > > > > > relate that to a release version. For example, if a user wants
> to
> > > > > > downgrade
> > > > > > > to a release 2.5, it's easier for the user to use the tool like
> > > "tool
> > > > > > > --downgrade 2.5" instead of "tool --downgrade --feature X
> > --version
> > > > 6".
> > > > > > > Similarly, if the client library finds a feature mismatch with
> > the
> > > > > > broker,
> > > > > > > the client likely needs to log some error message for the user
> to
> > > > take
> > > > > > some
> > > > > > > actions. It's much more actionable if the error message is
> > "upgrade
> > > > the
> > > > > > > broker to release version 2.6" than just "upgrade the broker to
> > > > feature
> > > > > > > version 7".
> > > > > > >
> > > > > > > 111. Sounds good.
> > > > > > >
> > > > > > > 120. When should a developer bump up the version of a feature?
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > > > On Tue, Apr 7, 2020 at 12:26 AM Kowshik Prakasam <
> > > > > kpraka...@confluent.io
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Jun,
> > > > > > > >
> > > > > > > > I have updated the KIP for the item 111.
> > > > > > > > I'm in the process of addressing 100.6, and will provide an
> > > update
> > > > > > soon.
> > > > > > > > I think item 110 is still under discussion given we are now
> > > > > providing a
> > > > > > > way
> > > > > > > > to finalize
> > > > > > > > all features to their latest version levels. In any case,
> > please
> > > > let
> > > > > us
> > > > > > > > know
> > > > > > > > how you feel in response to Colin's comments on this topic.
> > > > > > > >
> > > > > > > > > 111. To put this in context, when we had IBP, the default
> > value
> > > > is
> > > > > > the
> > > > > > > > > current released version. So, if you are a brand new user,
> > you
> > > > > don't
> > > > > > > need
> > > > > > > > > to configure IBP and all new features will be immediately
> > > > available
> > > > > > in
> > > > > > > > the
> > > > > > > > > new cluster. If you are upgrading from an old version, you
> do
> > > > need
> > > > > to
> > > > > > > > > understand and configure IBP. I see a similar pattern here
> > for
> > > > > > > > > features. From the ease of use perspective, ideally, we
> > > shouldn't
> > > > > > > require
> > > > > > > > a
> > > > > > > > > new user to have an extra step such as running a bootstrap
> > > script
> > > > > > > unless
> > > > > > > > > it's truly necessary. If someone has a special need (all
> the
> > > > cases
> > > > > > you
> > > > > > > > > mentioned seem special cases?), they can configure a mode
> > such
> > > > that
> > > > > > > > > features are enabled/disabled manually.
> > > > > > > >
> > > > > > > > (Kowshik): That makes sense, thanks for the idea! Sorry if I
> > > didn't
> > > > > > > > understand
> > > > > > > > this need earlier. I have updated the KIP with the approach
> > that
> > > > > > whenever
> > > > > > > > the '/features' node is absent, the controller by default
> will
> > > > > > bootstrap
> > > > > > > > the node
> > > > > > > > to contain the latest feature levels. Here is the new section
> > in
> > > > the
> > > > > > KIP
> > > > > > > > describing
> > > > > > > > the same:
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Controller:ZKnodebootstrapwithdefaultvalues
> > > > > > > >
> > > > > > > > Next, as I explained in my response to Colin's suggestions,
> we
> > > are
> > > > > now
> > > > > > > > providing a `--finalize-latest-features` flag with the
> tooling.
> > > > This
> > > > > > lets
> > > > > > > > the sysadmin finalize all features known to the controller to
> > > their
> > > > > > > latest
> > > > > > > > version
> > > > > > > > levels. Please look at this section (point #3 and the tooling
> > > > example
> > > > > > > > later):
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Toolingsupport
> > > > > > > >
> > > > > > > >
> > > > > > > > Do you feel this addresses your comment/concern?
> > > > > > > >
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Kowshik
> > > > > > > >
> > > > > > > > On Mon, Apr 6, 2020 at 12:06 PM Jun Rao <j...@confluent.io>
> > > wrote:
> > > > > > > >
> > > > > > > > > Hi, Kowshik,
> > > > > > > > >
> > > > > > > > > Thanks for the reply. A few more replies below.
> > > > > > > > >
> > > > > > > > > 100.6 You can look for the sentence "This operation
> requires
> > > > ALTER
> > > > > on
> > > > > > > > > CLUSTER." in KIP-455. Also, you can check its usage in
> > > > > > > > > KafkaApis.authorize().
> > > > > > > > >
> > > > > > > > > 110. From the external client/tooling perspective, it's
> more
> > > > > natural
> > > > > > to
> > > > > > > > use
> > > > > > > > > the release version for features. If we can use the same
> > > release
> > > > > > > version
> > > > > > > > > for internal representation, it seems simpler (easier to
> > > > > understand,
> > > > > > no
> > > > > > > > > mapping overhead, etc). Is there a benefit with separate
> > > external
> > > > > and
> > > > > > > > > internal versioning schemes?
> > > > > > > > >
> > > > > > > > > 111. To put this in context, when we had IBP, the default
> > value
> > > > is
> > > > > > the
> > > > > > > > > current released version. So, if you are a brand new user,
> > you
> > > > > don't
> > > > > > > need
> > > > > > > > > to configure IBP and all new features will be immediately
> > > > available
> > > > > > in
> > > > > > > > the
> > > > > > > > > new cluster. If you are upgrading from an old version, you
> do
> > > > need
> > > > > to
> > > > > > > > > understand and configure IBP. I see a similar pattern here
> > for
> > > > > > > > > features. From the ease of use perspective, ideally, we
> > > shouldn't
> > > > > > > > require a
> > > > > > > > > new user to have an extra step such as running a bootstrap
> > > script
> > > > > > > unless
> > > > > > > > > it's truly necessary. If someone has a special need (all
> the
> > > > cases
> > > > > > you
> > > > > > > > > mentioned seem special cases?), they can configure a mode
> > such
> > > > that
> > > > > > > > > features are enabled/disabled manually.
> > > > > > > > >
> > > > > > > > > Jun
> > > > > > > > >
> > > > > > > > > On Fri, Apr 3, 2020 at 5:54 PM Kowshik Prakasam <
> > > > > > > kpraka...@confluent.io>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Jun,
> > > > > > > > > >
> > > > > > > > > > Thanks for the feedback and suggestions. Please find my
> > > > response
> > > > > > > below.
> > > > > > > > > >
> > > > > > > > > > > 100.6 For every new request, the admin needs to control
> > who
> > > > is
> > > > > > > > allowed
> > > > > > > > > to
> > > > > > > > > > > issue that request if security is enabled. So, we need
> to
> > > > > assign
> > > > > > > the
> > > > > > > > > new
> > > > > > > > > > > request a ResourceType and possible AclOperations. See
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-455%3A+Create+an+Administrative+API+for+Replica+Reassignment
> > > > > > > > > > > as an example.
> > > > > > > > > >
> > > > > > > > > > (Kowshik): I don't see any reference to the words
> > > ResourceType
> > > > or
> > > > > > > > > > AclOperations
> > > > > > > > > > in the KIP. Please let me know how I can use the KIP that
> > you
> > > > > > linked
> > > > > > > to
> > > > > > > > > > know how to
> > > > > > > > > > setup the appropriate ResourceType and/or
> ClusterOperation?
> > > > > > > > > >
> > > > > > > > > > > 105. If we change delete to disable, it's better to do
> > this
> > > > > > > > > consistently
> > > > > > > > > > in
> > > > > > > > > > > request protocol and admin api as well.
> > > > > > > > > >
> > > > > > > > > > (Kowshik): The API shouldn't be called 'disable' when it
> is
> > > > > > deleting
> > > > > > > a
> > > > > > > > > > feature.
> > > > > > > > > > I've just changed the KIP to use 'delete'. I don't have a
> > > > strong
> > > > > > > > > > preference.
> > > > > > > > > >
> > > > > > > > > > > 110. The minVersion/maxVersion for features use int64.
> > > > > Currently,
> > > > > > > our
> > > > > > > > > > > release version schema is major.minor.bugfix (e.g.
> > 2.5.0).
> > > > It's
> > > > > > > > > possible
> > > > > > > > > > > for new features to be included in minor releases too.
> > > Should
> > > > > we
> > > > > > > make
> > > > > > > > > the
> > > > > > > > > > > feature versioning match the release versioning?
> > > > > > > > > >
> > > > > > > > > > (Kowshik): The release version can be mapped to a set of
> > > > feature
> > > > > > > > > versions,
> > > > > > > > > > and this can be done, for example in the tool (or even
> > > external
> > > > > to
> > > > > > > the
> > > > > > > > > > tool).
> > > > > > > > > > Can you please clarify what I'm missing?
> > > > > > > > > >
> > > > > > > > > > > 111. "During regular operations, the data in the ZK
> node
> > > can
> > > > be
> > > > > > > > mutated
> > > > > > > > > > > only via a specific admin API served only by the
> > > > controller." I
> > > > > > am
> > > > > > > > > > > wondering why can't the controller auto finalize a
> > feature
> > > > > > version
> > > > > > > > > after
> > > > > > > > > > > all brokers are upgraded? For new users who download
> the
> > > > latest
> > > > > > > > version
> > > > > > > > > > to
> > > > > > > > > > > build a new cluster, it's inconvenient for them to have
> > to
> > > > > > manually
> > > > > > > > > > enable
> > > > > > > > > > > each feature.
> > > > > > > > > >
> > > > > > > > > > (Kowshik): I agree that there is a trade-off here, but it
> > > will
> > > > > help
> > > > > > > > > > to decide whether the automation can be thought through
> in
> > > the
> > > > > > future
> > > > > > > > > > in a follow up KIP, or right now in this KIP. We may
> invest
> > > > > > > > > > in automation, but we have to decide whether we should do
> > it
> > > > > > > > > > now or later.
> > > > > > > > > >
> > > > > > > > > > For the inconvenience that you mentioned, do you think
> the
> > > > > problem
> > > > > > > that
> > > > > > > > > you
> > > > > > > > > > mentioned can be  overcome by asking for the cluster
> > operator
> > > > to
> > > > > > run
> > > > > > > a
> > > > > > > > > > bootstrap script  when he/she knows that a specific AK
> > > release
> > > > > has
> > > > > > > been
> > > > > > > > > > almost completely deployed in a cluster for the first
> time?
> > > > Idea
> > > > > is
> > > > > > > > that
> > > > > > > > > > the
> > > > > > > > > > bootstrap script will know how to map a specific AK
> release
> > > to
> > > > > > > > finalized
> > > > > > > > > > feature versions, and run the `kafka-features.sh` tool
> > > > > > appropriately
> > > > > > > > > > against
> > > > > > > > > > the cluster.
> > > > > > > > > >
> > > > > > > > > > Now, coming back to your automation proposal/question.
> > > > > > > > > > I do see the value of automated feature version
> > finalization,
> > > > > but I
> > > > > > > > also
> > > > > > > > > > see
> > > > > > > > > > that this will open up several questions and some risks,
> as
> > > > > > explained
> > > > > > > > > > below.
> > > > > > > > > > The answers to these depend on the definition of the
> > > automation
> > > > > we
> > > > > > > > choose
> > > > > > > > > > to build, and how well does it fit into a kafka
> deployment.
> > > > > > > > > > Basically, it can be unsafe for the controller to
> finalize
> > > > > feature
> > > > > > > > > version
> > > > > > > > > > upgrades automatically, without learning about the intent
> > of
> > > > the
> > > > > > > > cluster
> > > > > > > > > > operator.
> > > > > > > > > > 1. We would sometimes want to lock feature versions only
> > when
> > > > we
> > > > > > have
> > > > > > > > > > externally verified
> > > > > > > > > > the stability of the broker binary.
> > > > > > > > > > 2. Sometimes only the cluster operator knows that a
> cluster
> > > > > upgrade
> > > > > > > is
> > > > > > > > > > complete,
> > > > > > > > > > and new brokers are highly unlikely to join the cluster.
> > > > > > > > > > 3. Only the cluster operator knows that the intent is to
> > > deploy
> > > > > the
> > > > > > > > same
> > > > > > > > > > version
> > > > > > > > > > of the new broker release across the entire cluster (i.e.
> > the
> > > > > > latest
> > > > > > > > > > downloaded version).
> > > > > > > > > > 4. For downgrades, it appears the controller still needs
> > some
> > > > > > > external
> > > > > > > > > > input
> > > > > > > > > > (such as the proposed tool) to finalize a feature version
> > > > > > downgrade.
> > > > > > > > > >
> > > > > > > > > > If we have automation, that automation can end up failing
> > in
> > > > some
> > > > > > of
> > > > > > > > the
> > > > > > > > > > cases
> > > > > > > > > > above. Then, we need a way to declare that the cluster is
> > > "not
> > > > > > ready"
> > > > > > > > if
> > > > > > > > > > the
> > > > > > > > > > controller cannot automatically finalize some basic
> > required
> > > > > > feature
> > > > > > > > > > version
> > > > > > > > > > upgrades across the cluster. We need to make the cluster
> > > > operator
> > > > > > > aware
> > > > > > > > > in
> > > > > > > > > > such a scenario (raise an alert or alike).
> > > > > > > > > >
> > > > > > > > > > > 112. DeleteFeaturesResponse: It seems the apiKey should
> > be
> > > 49
> > > > > > > instead
> > > > > > > > > of
> > > > > > > > > > 48.
> > > > > > > > > >
> > > > > > > > > > (Kowshik): Done.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > > Kowshik
> > > > > > > > > >
> > > > > > > > > > On Fri, Apr 3, 2020 at 11:24 AM Jun Rao <
> j...@confluent.io>
> > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi, Kowshik,
> > > > > > > > > > >
> > > > > > > > > > > Thanks for the reply. A few more comments below.
> > > > > > > > > > >
> > > > > > > > > > > 100.6 For every new request, the admin needs to control
> > who
> > > > is
> > > > > > > > allowed
> > > > > > > > > to
> > > > > > > > > > > issue that request if security is enabled. So, we need
> to
> > > > > assign
> > > > > > > the
> > > > > > > > > new
> > > > > > > > > > > request a ResourceType and possible AclOperations. See
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-455%3A+Create+an+Administrative+API+for+Replica+Reassignment
> > > > > > > > > > > as
> > > > > > > > > > > an example.
> > > > > > > > > > >
> > > > > > > > > > > 105. If we change delete to disable, it's better to do
> > this
> > > > > > > > > consistently
> > > > > > > > > > in
> > > > > > > > > > > request protocol and admin api as well.
> > > > > > > > > > >
> > > > > > > > > > > 110. The minVersion/maxVersion for features use int64.
> > > > > Currently,
> > > > > > > our
> > > > > > > > > > > release version schema is major.minor.bugfix (e.g.
> > 2.5.0).
> > > > It's
> > > > > > > > > possible
> > > > > > > > > > > for new features to be included in minor releases too.
> > > Should
> > > > > we
> > > > > > > make
> > > > > > > > > the
> > > > > > > > > > > feature versioning match the release versioning?
> > > > > > > > > > >
> > > > > > > > > > > 111. "During regular operations, the data in the ZK
> node
> > > can
> > > > be
> > > > > > > > mutated
> > > > > > > > > > > only via a specific admin API served only by the
> > > > controller." I
> > > > > > am
> > > > > > > > > > > wondering why can't the controller auto finalize a
> > feature
> > > > > > version
> > > > > > > > > after
> > > > > > > > > > > all brokers are upgraded? For new users who download
> the
> > > > latest
> > > > > > > > version
> > > > > > > > > > to
> > > > > > > > > > > build a new cluster, it's inconvenient for them to have
> > to
> > > > > > manually
> > > > > > > > > > enable
> > > > > > > > > > > each feature.
> > > > > > > > > > >
> > > > > > > > > > > 112. DeleteFeaturesResponse: It seems the apiKey should
> > be
> > > 49
> > > > > > > instead
> > > > > > > > > of
> > > > > > > > > > > 48.
> > > > > > > > > > >
> > > > > > > > > > > Jun
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Apr 3, 2020 at 1:27 AM Kowshik Prakasam <
> > > > > > > > > kpraka...@confluent.io>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hey Jun,
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks a lot for the great feedback! Please note that
> > the
> > > > > > design
> > > > > > > > > > > > has changed a little bit on the KIP, and we now
> > propagate
> > > > the
> > > > > > > > > finalized
> > > > > > > > > > > > features metadata only via ZK watches (instead of
> > > > > > > > > UpdateMetadataRequest
> > > > > > > > > > > > from the controller).
> > > > > > > > > > > >
> > > > > > > > > > > > Please find below my response to your
> > questions/feedback,
> > > > > with
> > > > > > > the
> > > > > > > > > > prefix
> > > > > > > > > > > > "(Kowshik):".
> > > > > > > > > > > >
> > > > > > > > > > > > > 100. UpdateFeaturesRequest/UpdateFeaturesResponse
> > > > > > > > > > > > > 100.1 Since this request waits for responses from
> > > > brokers,
> > > > > > > should
> > > > > > > > > we
> > > > > > > > > > > add
> > > > > > > > > > > > a
> > > > > > > > > > > > > timeout in the request (like createTopicRequest)?
> > > > > > > > > > > >
> > > > > > > > > > > > (Kowshik): Great point! Done. I have added a timeout
> > > field.
> > > > > > Note:
> > > > > > > > we
> > > > > > > > > no
> > > > > > > > > > > > longer
> > > > > > > > > > > > wait for responses from brokers, since the design has
> > > been
> > > > > > > changed
> > > > > > > > so
> > > > > > > > > > > that
> > > > > > > > > > > > the
> > > > > > > > > > > > features information is propagated via ZK.
> > Nevertheless,
> > > it
> > > > > is
> > > > > > > > right
> > > > > > > > > to
> > > > > > > > > > > > have a timeout
> > > > > > > > > > > > for the request.
> > > > > > > > > > > >
> > > > > > > > > > > > > 100.2 The response schema is a bit weird.
> Typically,
> > > the
> > > > > > > response
> > > > > > > > > > just
> > > > > > > > > > > > > shows an error code and an error message, instead
> of
> > > > > echoing
> > > > > > > the
> > > > > > > > > > > request.
> > > > > > > > > > > >
> > > > > > > > > > > > (Kowshik): Great point! Yeah, I have modified it to
> > just
> > > > > return
> > > > > > > an
> > > > > > > > > > error
> > > > > > > > > > > > code and a message.
> > > > > > > > > > > > Previously it was not echoing the "request", rather
> it
> > > was
> > > > > > > > returning
> > > > > > > > > > the
> > > > > > > > > > > > latest set of
> > > > > > > > > > > > cluster-wide finalized features (after applying the
> > > > updates).
> > > > > > But
> > > > > > > > you
> > > > > > > > > > are
> > > > > > > > > > > > right,
> > > > > > > > > > > > the additional info is not required, so I have
> removed
> > it
> > > > > from
> > > > > > > the
> > > > > > > > > > > response
> > > > > > > > > > > > schema.
> > > > > > > > > > > >
> > > > > > > > > > > > > 100.3 Should we add a separate request to
> > list/describe
> > > > the
> > > > > > > > > existing
> > > > > > > > > > > > > features?
> > > > > > > > > > > >
> > > > > > > > > > > > (Kowshik): This is already present in the KIP via the
> > > > > > > > > > 'DescribeFeatures'
> > > > > > > > > > > > Admin API,
> > > > > > > > > > > > which, underneath covers uses the ApiVersionsRequest
> to
> > > > > > > > list/describe
> > > > > > > > > > the
> > > > > > > > > > > > existing features. Please read the 'Tooling support'
> > > > section.
> > > > > > > > > > > >
> > > > > > > > > > > > > 100.4 We are mixing ADD_OR_UPDATE and DELETE in a
> > > single
> > > > > > > request.
> > > > > > > > > For
> > > > > > > > > > > > > DELETE, the version field doesn't make sense. So, I
> > > guess
> > > > > the
> > > > > > > > > broker
> > > > > > > > > > > just
> > > > > > > > > > > > > ignores this? An alternative way is to have a
> > separate
> > > > > > > > > > > > DeleteFeaturesRequest
> > > > > > > > > > > >
> > > > > > > > > > > > (Kowshik): Great point! I have modified the KIP now
> to
> > > > have 2
> > > > > > > > > separate
> > > > > > > > > > > > controller APIs
> > > > > > > > > > > > serving these different purposes:
> > > > > > > > > > > > 1. updateFeatures
> > > > > > > > > > > > 2. deleteFeatures
> > > > > > > > > > > >
> > > > > > > > > > > > > 100.5 In UpdateFeaturesResponse, we have "The
> > > > monotonically
> > > > > > > > > > increasing
> > > > > > > > > > > > > version of the metadata for finalized features." I
> am
> > > > > > wondering
> > > > > > > > why
> > > > > > > > > > the
> > > > > > > > > > > > > ordering is important?
> > > > > > > > > > > >
> > > > > > > > > > > > (Kowshik): In the latest KIP write-up, it is called
> > epoch
> > > > > > > (instead
> > > > > > > > of
> > > > > > > > > > > > version), and
> > > > > > > > > > > > it is just the ZK node version. Basically, this is
> the
> > > > epoch
> > > > > > for
> > > > > > > > the
> > > > > > > > > > > > cluster-wide
> > > > > > > > > > > > finalized feature version metadata. This metadata is
> > > served
> > > > > to
> > > > > > > > > clients
> > > > > > > > > > > via
> > > > > > > > > > > > the
> > > > > > > > > > > > ApiVersionsResponse (for reads). We propagate updates
> > > from
> > > > > the
> > > > > > > > > > > '/features'
> > > > > > > > > > > > ZK node
> > > > > > > > > > > > to all brokers, via ZK watches setup by each broker
> on
> > > the
> > > > > > > > > '/features'
> > > > > > > > > > > > node.
> > > > > > > > > > > >
> > > > > > > > > > > > Now here is why the ordering is important:
> > > > > > > > > > > > ZK watches don't propagate at the same time. As a
> > result,
> > > > the
> > > > > > > > > > > > ApiVersionsResponse
> > > > > > > > > > > > is eventually consistent across brokers. This can
> > > introduce
> > > > > > cases
> > > > > > > > > > > > where clients see an older lower epoch of the
> features
> > > > > > metadata,
> > > > > > > > > after
> > > > > > > > > > a
> > > > > > > > > > > > more recent
> > > > > > > > > > > > higher epoch was returned at a previous point in
> time.
> > We
> > > > > > expect
> > > > > > > > > > clients
> > > > > > > > > > > > to always employ the rule that the latest received
> > higher
> > > > > epoch
> > > > > > > of
> > > > > > > > > > > metadata
> > > > > > > > > > > > always trumps an older smaller epoch. Those clients
> > that
> > > > are
> > > > > > > > external
> > > > > > > > > > to
> > > > > > > > > > > > Kafka should strongly consider discovering the latest
> > > > > metadata
> > > > > > > once
> > > > > > > > > > > during
> > > > > > > > > > > > startup from the brokers, and if required refresh the
> > > > > metadata
> > > > > > > > > > > periodically
> > > > > > > > > > > > (to get the latest metadata).
> > > > > > > > > > > >
> > > > > > > > > > > > > 100.6 Could you specify the required ACL for this
> new
> > > > > > request?
> > > > > > > > > > > >
> > > > > > > > > > > > (Kowshik): What is ACL, and how could I find out
> which
> > > one
> > > > to
> > > > > > > > > specify?
> > > > > > > > > > > > Please could you provide me some pointers? I'll be
> glad
> > > to
> > > > > > update
> > > > > > > > the
> > > > > > > > > > > > KIP once I know the next steps.
> > > > > > > > > > > >
> > > > > > > > > > > > > 101. For the broker registration ZK node, should we
> > > bump
> > > > up
> > > > > > the
> > > > > > > > > > version
> > > > > > > > > > > > in
> > > > > > > > > > > > the json?
> > > > > > > > > > > >
> > > > > > > > > > > > (Kowshik): Great point! Done. I've increased the
> > version
> > > in
> > > > > the
> > > > > > > > > broker
> > > > > > > > > > > json
> > > > > > > > > > > > by 1.
> > > > > > > > > > > >
> > > > > > > > > > > > > 102. For the /features ZK node, not sure if we need
> > the
> > > > > epoch
> > > > > > > > > field.
> > > > > > > > > > > Each
> > > > > > > > > > > > > ZK node has an internal version field that is
> > > incremented
> > > > > on
> > > > > > > > every
> > > > > > > > > > > > update.
> > > > > > > > > > > >
> > > > > > > > > > > > (Kowshik): Great point! Done. I'm using the ZK node
> > > version
> > > > > > now,
> > > > > > > > > > instead
> > > > > > > > > > > of
> > > > > > > > > > > > explicitly
> > > > > > > > > > > > incremented epoch.
> > > > > > > > > > > >
> > > > > > > > > > > > > 103. "Enabling the actual semantics of a feature
> > > version
> > > > > > > > > cluster-wide
> > > > > > > > > > > is
> > > > > > > > > > > > > left to the discretion of the logic implementing
> the
> > > > > feature
> > > > > > > (ex:
> > > > > > > > > can
> > > > > > > > > > > be
> > > > > > > > > > > > > done via dynamic broker config)." Does that mean
> the
> > > > broker
> > > > > > > > > > > registration
> > > > > > > > > > > > ZK
> > > > > > > > > > > > > node will be updated dynamically when this happens?
> > > > > > > > > > > >
> > > > > > > > > > > > (Kowshik): Not really. The text was just conveying
> > that a
> > > > > > broker
> > > > > > > > > could
> > > > > > > > > > > > "know" of
> > > > > > > > > > > > a new feature version, but it does not mean the
> broker
> > > > should
> > > > > > > have
> > > > > > > > > also
> > > > > > > > > > > > activated the effects of the feature version. Knowing
> > vs
> > > > > > > activation
> > > > > > > > > > are 2
> > > > > > > > > > > > separate things,
> > > > > > > > > > > > and the latter can be achieved by dynamic config. I
> > have
> > > > > > reworded
> > > > > > > > the
> > > > > > > > > > > text
> > > > > > > > > > > > to
> > > > > > > > > > > > make this clear to the reader.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > 104. UpdateMetadataRequest
> > > > > > > > > > > > > 104.1 It would be useful to describe when the
> feature
> > > > > > metadata
> > > > > > > is
> > > > > > > > > > > > included
> > > > > > > > > > > > > in the request. My understanding is that it's only
> > > > included
> > > > > > if
> > > > > > > > (1)
> > > > > > > > > > > there
> > > > > > > > > > > > is
> > > > > > > > > > > > > a change to the finalized feature; (2) broker
> > restart;
> > > > (3)
> > > > > > > > > controller
> > > > > > > > > > > > > failover.
> > > > > > > > > > > > > 104.2 The new fields have the following versions.
> Why
> > > are
> > > > > the
> > > > > > > > > > versions
> > > > > > > > > > > 3+
> > > > > > > > > > > > > when the top version is bumped to 6?
> > > > > > > > > > > > >       "fields":  [
> > > > > > > > > > > > >         {"name": "Name", "type":  "string",
> > "versions":
> > > > > > "3+",
> > > > > > > > > > > > >           "about": "The name of the feature."},
> > > > > > > > > > > > >         {"name":  "Version", "type":  "int64",
> > > > "versions":
> > > > > > > "3+",
> > > > > > > > > > > > >           "about": "The finalized version for the
> > > > > feature."}
> > > > > > > > > > > > >       ]
> > > > > > > > > > > >
> > > > > > > > > > > > (Kowshik): With the new improved design, we have
> > > completely
> > > > > > > > > eliminated
> > > > > > > > > > > the
> > > > > > > > > > > > need to
> > > > > > > > > > > > use UpdateMetadataRequest. This is because we now
> rely
> > on
> > > > ZK
> > > > > to
> > > > > > > > > deliver
> > > > > > > > > > > the
> > > > > > > > > > > > notifications for changes to the '/features' ZK node.
> > > > > > > > > > > >
> > > > > > > > > > > > > 105. kafka-features.sh: Instead of using
> > update/delete,
> > > > > > perhaps
> > > > > > > > > it's
> > > > > > > > > > > > better
> > > > > > > > > > > > > to use enable/disable?
> > > > > > > > > > > >
> > > > > > > > > > > > (Kowshik): For delete, yes, I have changed it so that
> > we
> > > > > > instead
> > > > > > > > call
> > > > > > > > > > it
> > > > > > > > > > > > 'disable'.
> > > > > > > > > > > > However for 'update', it can now also refer to either
> > an
> > > > > > upgrade
> > > > > > > > or a
> > > > > > > > > > > > forced downgrade.
> > > > > > > > > > > > Therefore, I have left it the way it is, just calling
> > it
> > > as
> > > > > > just
> > > > > > > > > > > 'update'.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Cheers,
> > > > > > > > > > > > Kowshik
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Mar 31, 2020 at 6:51 PM Jun Rao <
> > > j...@confluent.io>
> > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi, Kowshik,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for the KIP. Looks good overall. A few
> > comments
> > > > > below.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 100. UpdateFeaturesRequest/UpdateFeaturesResponse
> > > > > > > > > > > > > 100.1 Since this request waits for responses from
> > > > brokers,
> > > > > > > should
> > > > > > > > > we
> > > > > > > > > > > add
> > > > > > > > > > > > a
> > > > > > > > > > > > > timeout in the request (like createTopicRequest)?
> > > > > > > > > > > > > 100.2 The response schema is a bit weird.
> Typically,
> > > the
> > > > > > > response
> > > > > > > > > > just
> > > > > > > > > > > > > shows an error code and an error message, instead
> of
> > > > > echoing
> > > > > > > the
> > > > > > > > > > > request.
> > > > > > > > > > > > > 100.3 Should we add a separate request to
> > list/describe
> > > > the
> > > > > > > > > existing
> > > > > > > > > > > > > features?
> > > > > > > > > > > > > 100.4 We are mixing ADD_OR_UPDATE and DELETE in a
> > > single
> > > > > > > request.
> > > > > > > > > For
> > > > > > > > > > > > > DELETE, the version field doesn't make sense. So, I
> > > guess
> > > > > the
> > > > > > > > > broker
> > > > > > > > > > > just
> > > > > > > > > > > > > ignores this? An alternative way is to have a
> > separate
> > > > > > > > > > > > > DeleteFeaturesRequest
> > > > > > > > > > > > > 100.5 In UpdateFeaturesResponse, we have "The
> > > > monotonically
> > > > > > > > > > increasing
> > > > > > > > > > > > > version of the metadata for finalized features." I
> am
> > > > > > wondering
> > > > > > > > why
> > > > > > > > > > the
> > > > > > > > > > > > > ordering is important?
> > > > > > > > > > > > > 100.6 Could you specify the required ACL for this
> new
> > > > > > request?
> > > > > > > > > > > > >
> > > > > > > > > > > > > 101. For the broker registration ZK node, should we
> > > bump
> > > > up
> > > > > > the
> > > > > > > > > > version
> > > > > > > > > > > > in
> > > > > > > > > > > > > the json?
> > > > > > > > > > > > >
> > > > > > > > > > > > > 102. For the /features ZK node, not sure if we need
> > the
> > > > > epoch
> > > > > > > > > field.
> > > > > > > > > > > Each
> > > > > > > > > > > > > ZK node has an internal version field that is
> > > incremented
> > > > > on
> > > > > > > > every
> > > > > > > > > > > > update.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 103. "Enabling the actual semantics of a feature
> > > version
> > > > > > > > > cluster-wide
> > > > > > > > > > > is
> > > > > > > > > > > > > left to the discretion of the logic implementing
> the
> > > > > feature
> > > > > > > (ex:
> > > > > > > > > can
> > > > > > > > > > > be
> > > > > > > > > > > > > done via dynamic broker config)." Does that mean
> the
> > > > broker
> > > > > > > > > > > registration
> > > > > > > > > > > > ZK
> > > > > > > > > > > > > node will be updated dynamically when this happens?
> > > > > > > > > > > > >
> > > > > > > > > > > > > 104. UpdateMetadataRequest
> > > > > > > > > > > > > 104.1 It would be useful to describe when the
> feature
> > > > > > metadata
> > > > > > > is
> > > > > > > > > > > > included
> > > > > > > > > > > > > in the request. My understanding is that it's only
> > > > included
> > > > > > if
> > > > > > > > (1)
> > > > > > > > > > > there
> > > > > > > > > > > > is
> > > > > > > > > > > > > a change to the finalized feature; (2) broker
> > restart;
> > > > (3)
> > > > > > > > > controller
> > > > > > > > > > > > > failover.
> > > > > > > > > > > > > 104.2 The new fields have the following versions.
> Why
> > > are
> > > > > the
> > > > > > > > > > versions
> > > > > > > > > > > 3+
> > > > > > > > > > > > > when the top version is bumped to 6?
> > > > > > > > > > > > >       "fields":  [
> > > > > > > > > > > > >         {"name": "Name", "type":  "string",
> > "versions":
> > > > > > "3+",
> > > > > > > > > > > > >           "about": "The name of the feature."},
> > > > > > > > > > > > >         {"name":  "Version", "type":  "int64",
> > > > "versions":
> > > > > > > "3+",
> > > > > > > > > > > > >           "about": "The finalized version for the
> > > > > feature."}
> > > > > > > > > > > > >       ]
> > > > > > > > > > > > >
> > > > > > > > > > > > > 105. kafka-features.sh: Instead of using
> > update/delete,
> > > > > > perhaps
> > > > > > > > > it's
> > > > > > > > > > > > better
> > > > > > > > > > > > > to use enable/disable?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Jun
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, Mar 31, 2020 at 5:29 PM Kowshik Prakasam <
> > > > > > > > > > > kpraka...@confluent.io
> > > > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hey Boyang,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks for the great feedback! I have updated the
> > KIP
> > > > > based
> > > > > > > on
> > > > > > > > > your
> > > > > > > > > > > > > > feedback.
> > > > > > > > > > > > > > Please find my response below for your comments,
> > look
> > > > for
> > > > > > > > > sentences
> > > > > > > > > > > > > > starting
> > > > > > > > > > > > > > with "(Kowshik)" below.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 1. "When is it safe for the brokers to begin
> > > handling
> > > > > EOS
> > > > > > > > > > traffic"
> > > > > > > > > > > > > could
> > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > converted as "When is it safe for the brokers
> to
> > > > start
> > > > > > > > serving
> > > > > > > > > > new
> > > > > > > > > > > > > > > Exactly-Once(EOS) semantics" since EOS is not
> > > > explained
> > > > > > > > earlier
> > > > > > > > > > in
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > context.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > (Kowshik): Great point! Done.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2. In the *Explanation *section, the metadata
> > > version
> > > > > > > number
> > > > > > > > > part
> > > > > > > > > > > > > seems a
> > > > > > > > > > > > > > > bit blurred. Could you point a reference to
> later
> > > > > section
> > > > > > > > that
> > > > > > > > > we
> > > > > > > > > > > > going
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > store it in Zookeeper and update it every time
> > when
> > > > > there
> > > > > > > is
> > > > > > > > a
> > > > > > > > > > > > feature
> > > > > > > > > > > > > > > change?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > (Kowshik): Great point! Done. I've added a
> > reference
> > > in
> > > > > the
> > > > > > > > KIP.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 3. For the feature downgrade, although it's a
> > > > Non-goal
> > > > > of
> > > > > > > the
> > > > > > > > > > KIP,
> > > > > > > > > > > > for
> > > > > > > > > > > > > > > features such as group coordinator semantics,
> > there
> > > > is
> > > > > no
> > > > > > > > legal
> > > > > > > > > > > > > scenario
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > perform a downgrade at all. So having downgrade
> > > door
> > > > > open
> > > > > > > is
> > > > > > > > > > pretty
> > > > > > > > > > > > > > > error-prone as human faults happen all the
> time.
> > > I'm
> > > > > > > assuming
> > > > > > > > > as
> > > > > > > > > > > new
> > > > > > > > > > > > > > > features are implemented, it's not very hard to
> > > add a
> > > > > > flag
> > > > > > > > > during
> > > > > > > > > > > > > feature
> > > > > > > > > > > > > > > creation to indicate whether this feature is
> > > > > > > "downgradable".
> > > > > > > > > > Could
> > > > > > > > > > > > you
> > > > > > > > > > > > > > > explain a bit more on the extra engineering
> > effort
> > > > for
> > > > > > > > shipping
> > > > > > > > > > > this
> > > > > > > > > > > > > KIP
> > > > > > > > > > > > > > > with downgrade protection in place?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > (Kowshik): Great point! I'd agree and disagree
> > here.
> > > > > While
> > > > > > I
> > > > > > > > > agree
> > > > > > > > > > > that
> > > > > > > > > > > > > > accidental
> > > > > > > > > > > > > > downgrades can cause problems, I also think
> > sometimes
> > > > > > > > downgrades
> > > > > > > > > > > should
> > > > > > > > > > > > > > be allowed for emergency reasons (not all
> > downgrades
> > > > > cause
> > > > > > > > > issues).
> > > > > > > > > > > > > > It is just subjective to the feature being
> > > downgraded.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > To be more strict about feature version
> > downgrades, I
> > > > > have
> > > > > > > > > modified
> > > > > > > > > > > the
> > > > > > > > > > > > > KIP
> > > > > > > > > > > > > > proposing that we mandate a `--force-downgrade`
> > flag
> > > be
> > > > > > used
> > > > > > > in
> > > > > > > > > the
> > > > > > > > > > > > > > UPDATE_FEATURES api
> > > > > > > > > > > > > > and the tooling, whenever the human is
> downgrading
> > a
> > > > > > > finalized
> > > > > > > > > > > feature
> > > > > > > > > > > > > > version.
> > > > > > > > > > > > > > Hopefully this should cover the requirement,
> until
> > we
> > > > > find
> > > > > > > the
> > > > > > > > > need
> > > > > > > > > > > for
> > > > > > > > > > > > > > advanced downgrade support.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 4. "Each broker’s supported dictionary of
> feature
> > > > > > versions
> > > > > > > > will
> > > > > > > > > > be
> > > > > > > > > > > > > > defined
> > > > > > > > > > > > > > > in the broker code." So this means in order to
> > > > > restrict a
> > > > > > > > > certain
> > > > > > > > > > > > > > feature,
> > > > > > > > > > > > > > > we need to start the broker first and then
> send a
> > > > > feature
> > > > > > > > > gating
> > > > > > > > > > > > > request
> > > > > > > > > > > > > > > immediately, which introduces a time gap and
> the
> > > > > > > > > > intended-to-close
> > > > > > > > > > > > > > feature
> > > > > > > > > > > > > > > could actually serve request during this phase.
> > Do
> > > > you
> > > > > > > think
> > > > > > > > we
> > > > > > > > > > > > should
> > > > > > > > > > > > > > also
> > > > > > > > > > > > > > > support configurations as well so that admin
> user
> > > > could
> > > > > > > > freely
> > > > > > > > > > roll
> > > > > > > > > > > > up
> > > > > > > > > > > > > a
> > > > > > > > > > > > > > > cluster with all nodes complying the same
> feature
> > > > > gating,
> > > > > > > > > without
> > > > > > > > > > > > > > worrying
> > > > > > > > > > > > > > > about the turnaround time to propagate the
> > message
> > > > only
> > > > > > > after
> > > > > > > > > the
> > > > > > > > > > > > > cluster
> > > > > > > > > > > > > > > starts up?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > (Kowshik): This is a great point/question. One of
> > the
> > > > > > > > > expectations
> > > > > > > > > > > out
> > > > > > > > > > > > of
> > > > > > > > > > > > > > this KIP, which is
> > > > > > > > > > > > > > already followed in the broker, is the following.
> > > > > > > > > > > > > >  - Imagine at time T1 the broker starts up and
> > > > registers
> > > > > > it’s
> > > > > > > > > > > presence
> > > > > > > > > > > > in
> > > > > > > > > > > > > > ZK,
> > > > > > > > > > > > > >    along with advertising it’s supported
> features.
> > > > > > > > > > > > > >  - Imagine at a future time T2 the broker
> receives
> > > the
> > > > > > > > > > > > > > UpdateMetadataRequest
> > > > > > > > > > > > > >    from the controller, which contains the latest
> > > > > finalized
> > > > > > > > > > features
> > > > > > > > > > > as
> > > > > > > > > > > > > > seen by
> > > > > > > > > > > > > >    the controller. The broker validates this data
> > > > against
> > > > > > > it’s
> > > > > > > > > > > > supported
> > > > > > > > > > > > > > features to
> > > > > > > > > > > > > >    make sure there is no mismatch (it will
> shutdown
> > > if
> > > > > > there
> > > > > > > is
> > > > > > > > > an
> > > > > > > > > > > > > > incompatibility).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > It is expected that during the time between the 2
> > > > events
> > > > > T1
> > > > > > > and
> > > > > > > > > T2,
> > > > > > > > > > > the
> > > > > > > > > > > > > > broker is
> > > > > > > > > > > > > > almost a silent entity in the cluster. It does
> not
> > > add
> > > > > any
> > > > > > > > value
> > > > > > > > > to
> > > > > > > > > > > the
> > > > > > > > > > > > > > cluster, or carry
> > > > > > > > > > > > > > out any important broker activities. By
> > “important”,
> > > I
> > > > > mean
> > > > > > > it
> > > > > > > > is
> > > > > > > > > > not
> > > > > > > > > > > > > doing
> > > > > > > > > > > > > > mutations
> > > > > > > > > > > > > > on it’s persistence, not mutating critical
> > in-memory
> > > > > state,
> > > > > > > > won’t
> > > > > > > > > > be
> > > > > > > > > > > > > > serving
> > > > > > > > > > > > > > produce/fetch requests. Note it doesn’t even know
> > > it’s
> > > > > > > assigned
> > > > > > > > > > > > > partitions
> > > > > > > > > > > > > > until
> > > > > > > > > > > > > > it receives UpdateMetadataRequest from
> controller.
> > > > > Anything
> > > > > > > the
> > > > > > > > > > > broker
> > > > > > > > > > > > is
> > > > > > > > > > > > > > doing up
> > > > > > > > > > > > > > until this point is not damaging/useful.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I’ve clarified the above in the KIP, see this new
> > > > > section:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Incompatiblebrokerlifetime
> > > > > > > > > > > > > > .
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 5. "adding a new Feature, updating or deleting
> an
> > > > > > existing
> > > > > > > > > > > Feature",
> > > > > > > > > > > > > may
> > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > I misunderstood something, I thought the
> features
> > > are
> > > > > > > defined
> > > > > > > > > in
> > > > > > > > > > > > broker
> > > > > > > > > > > > > > > code, so admin could not really create a new
> > > feature?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > (Kowshik): Great point! You understood this
> right.
> > > Here
> > > > > > > adding
> > > > > > > > a
> > > > > > > > > > > > feature
> > > > > > > > > > > > > > means we are
> > > > > > > > > > > > > > adding a cluster-wide finalized *max* version
> for a
> > > > > feature
> > > > > > > > that
> > > > > > > > > > was
> > > > > > > > > > > > > > previously never finalized.
> > > > > > > > > > > > > > I have clarified this in the KIP now.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 6. I think we need a separate error code like
> > > > > > > > > > > > > FEATURE_UPDATE_IN_PROGRESS
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > reject a concurrent feature update request.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > (Kowshik): Great point! I have modified the KIP
> > > adding
> > > > > the
> > > > > > > > above
> > > > > > > > > > (see
> > > > > > > > > > > > > > 'Tooling support -> Admin API changes').
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 7. I think we haven't discussed the alternative
> > > > > solution
> > > > > > to
> > > > > > > > > pass
> > > > > > > > > > > the
> > > > > > > > > > > > > > > feature information through Zookeeper. Is that
> > > > > mentioned
> > > > > > in
> > > > > > > > the
> > > > > > > > > > KIP
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > justify why using UpdateMetadata is more
> > favorable?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > (Kowshik): Nice question! The broker reads
> > finalized
> > > > > > feature
> > > > > > > > info
> > > > > > > > > > > > stored
> > > > > > > > > > > > > in
> > > > > > > > > > > > > > ZK,
> > > > > > > > > > > > > > only during startup when it does a validation.
> When
> > > > > serving
> > > > > > > > > > > > > > `ApiVersionsRequest`, the
> > > > > > > > > > > > > > broker does not read this info from ZK directly.
> > I'd
> > > > > > imagine
> > > > > > > > the
> > > > > > > > > > risk
> > > > > > > > > > > > is
> > > > > > > > > > > > > > that it can increase
> > > > > > > > > > > > > > the ZK read QPS which can be a bottleneck for the
> > > > system.
> > > > > > > > Today,
> > > > > > > > > in
> > > > > > > > > > > > Kafka
> > > > > > > > > > > > > > we use the
> > > > > > > > > > > > > > controller to fan out ZK updates to brokers and
> we
> > > want
> > > > > to
> > > > > > > > stick
> > > > > > > > > to
> > > > > > > > > > > > that
> > > > > > > > > > > > > > pattern to avoid
> > > > > > > > > > > > > > the ZK read bottleneck when serving
> > > > `ApiVersionsRequest`.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 8. I was under the impression that user could
> > > > > configure a
> > > > > > > > range
> > > > > > > > > > of
> > > > > > > > > > > > > > > supported versions, what's the trade-off for
> > > allowing
> > > > > > > single
> > > > > > > > > > > > finalized
> > > > > > > > > > > > > > > version only?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > (Kowshik): Great question! The finalized version
> > of a
> > > > > > feature
> > > > > > > > > > > basically
> > > > > > > > > > > > > > refers to
> > > > > > > > > > > > > > the cluster-wide finalized feature "maximum"
> > version.
> > > > For
> > > > > > > > > example,
> > > > > > > > > > if
> > > > > > > > > > > > the
> > > > > > > > > > > > > > 'group_coordinator' feature
> > > > > > > > > > > > > > has the finalized version set to 10, then, it
> means
> > > > that
> > > > > > > > > > cluster-wide
> > > > > > > > > > > > all
> > > > > > > > > > > > > > versions upto v10 are
> > > > > > > > > > > > > > supported for this feature. However, note that if
> > > some
> > > > > > > version
> > > > > > > > > (ex:
> > > > > > > > > > > v0)
> > > > > > > > > > > > > > gets deprecated
> > > > > > > > > > > > > > for this feature, then we don’t convey that using
> > > this
> > > > > > scheme
> > > > > > > > > (also
> > > > > > > > > > > > > > supporting deprecation is a non-goal).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > (Kowshik): I’ve now modified the KIP at all
> points,
> > > > > > refering
> > > > > > > to
> > > > > > > > > > > > finalized
> > > > > > > > > > > > > > feature "maximum" versions.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 9. One minor syntax fix: Note that here the
> > > "client"
> > > > > here
> > > > > > > may
> > > > > > > > > be
> > > > > > > > > > a
> > > > > > > > > > > > > > producer
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > (Kowshik): Great point! Done.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > Kowshik
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, Mar 31, 2020 at 1:17 PM Boyang Chen <
> > > > > > > > > > > > reluctanthero...@gmail.com>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hey Kowshik,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > thanks for the revised KIP. Got a couple of
> > > > questions:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 1. "When is it safe for the brokers to begin
> > > handling
> > > > > EOS
> > > > > > > > > > traffic"
> > > > > > > > > > > > > could
> > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > converted as "When is it safe for the brokers
> to
> > > > start
> > > > > > > > serving
> > > > > > > > > > new
> > > > > > > > > > > > > > > Exactly-Once(EOS) semantics" since EOS is not
> > > > explained
> > > > > > > > earlier
> > > > > > > > > > in
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > context.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2. In the *Explanation *section, the metadata
> > > version
> > > > > > > number
> > > > > > > > > part
> > > > > > > > > > > > > seems a
> > > > > > > > > > > > > > > bit blurred. Could you point a reference to
> later
> > > > > section
> > > > > > > > that
> > > > > > > > > we
> > > > > > > > > > > > going
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > store it in Zookeeper and update it every time
> > when
> > > > > there
> > > > > > > is
> > > > > > > > a
> > > > > > > > > > > > feature
> > > > > > > > > > > > > > > change?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 3. For the feature downgrade, although it's a
> > > > Non-goal
> > > > > of
> > > > > > > the
> > > > > > > > > > KIP,
> > > > > > > > > > > > for
> > > > > > > > > > > > > > > features such as group coordinator semantics,
> > there
> > > > is
> > > > > no
> > > > > > > > legal
> > > > > > > > > > > > > scenario
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > perform a downgrade at all. So having downgrade
> > > door
> > > > > open
> > > > > > > is
> > > > > > > > > > pretty
> > > > > > > > > > > > > > > error-prone as human faults happen all the
> time.
> > > I'm
> > > > > > > assuming
> > > > > > > > > as
> > > > > > > > > > > new
> > > > > > > > > > > > > > > features are implemented, it's not very hard to
> > > add a
> > > > > > flag
> > > > > > > > > during
> > > > > > > > > > > > > feature
> > > > > > > > > > > > > > > creation to indicate whether this feature is
> > > > > > > "downgradable".
> > > > > > > > > > Could
> > > > > > > > > > > > you
> > > > > > > > > > > > > > > explain a bit more on the extra engineering
> > effort
> > > > for
> > > > > > > > shipping
> > > > > > > > > > > this
> > > > > > > > > > > > > KIP
> > > > > > > > > > > > > > > with downgrade protection in place?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 4. "Each broker’s supported dictionary of
> feature
> > > > > > versions
> > > > > > > > will
> > > > > > > > > > be
> > > > > > > > > > > > > > defined
> > > > > > > > > > > > > > > in the broker code." So this means in order to
> > > > > restrict a
> > > > > > > > > certain
> > > > > > > > > > > > > > feature,
> > > > > > > > > > > > > > > we need to start the broker first and then
> send a
> > > > > feature
> > > > > > > > > gating
> > > > > > > > > > > > > request
> > > > > > > > > > > > > > > immediately, which introduces a time gap and
> the
> > > > > > > > > > intended-to-close
> > > > > > > > > > > > > > feature
> > > > > > > > > > > > > > > could actually serve request during this phase.
> > Do
> > > > you
> > > > > > > think
> > > > > > > > we
> > > > > > > > > > > > should
> > > > > > > > > > > > > > also
> > > > > > > > > > > > > > > support configurations as well so that admin
> user
> > > > could
> > > > > > > > freely
> > > > > > > > > > roll
> > > > > > > > > > > > up
> > > > > > > > > > > > > a
> > > > > > > > > > > > > > > cluster with all nodes complying the same
> feature
> > > > > gating,
> > > > > > > > > without
> > > > > > > > > > > > > > worrying
> > > > > > > > > > > > > > > about the turnaround time to propagate the
> > message
> > > > only
> > > > > > > after
> > > > > > > > > the
> > > > > > > > > > > > > cluster
> > > > > > > > > > > > > > > starts up?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 5. "adding a new Feature, updating or deleting
> an
> > > > > > existing
> > > > > > > > > > > Feature",
> > > > > > > > > > > > > may
> > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > I misunderstood something, I thought the
> features
> > > are
> > > > > > > defined
> > > > > > > > > in
> > > > > > > > > > > > broker
> > > > > > > > > > > > > > > code, so admin could not really create a new
> > > feature?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 6. I think we need a separate error code like
> > > > > > > > > > > > > FEATURE_UPDATE_IN_PROGRESS
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > reject a concurrent feature update request.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 7. I think we haven't discussed the alternative
> > > > > solution
> > > > > > to
> > > > > > > > > pass
> > > > > > > > > > > the
> > > > > > > > > > > > > > > feature information through Zookeeper. Is that
> > > > > mentioned
> > > > > > in
> > > > > > > > the
> > > > > > > > > > KIP
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > justify why using UpdateMetadata is more
> > favorable?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 8. I was under the impression that user could
> > > > > configure a
> > > > > > > > range
> > > > > > > > > > of
> > > > > > > > > > > > > > > supported versions, what's the trade-off for
> > > allowing
> > > > > > > single
> > > > > > > > > > > > finalized
> > > > > > > > > > > > > > > version only?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 9. One minor syntax fix: Note that here the
> > > "client"
> > > > > here
> > > > > > > may
> > > > > > > > > be
> > > > > > > > > > a
> > > > > > > > > > > > > > producer
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Boyang
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Mon, Mar 30, 2020 at 4:53 PM Colin McCabe <
> > > > > > > > > cmcc...@apache.org
> > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Thu, Mar 26, 2020, at 19:24, Kowshik
> > Prakasam
> > > > > wrote:
> > > > > > > > > > > > > > > > > Hi Colin,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks for the feedback! I've changed the
> KIP
> > > to
> > > > > > > address
> > > > > > > > > your
> > > > > > > > > > > > > > > > > suggestions.
> > > > > > > > > > > > > > > > > Please find below my explanation. Here is a
> > > link
> > > > to
> > > > > > KIP
> > > > > > > > > 584:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features
> > > > > > > > > > > > > > > > > .
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 1. '__data_version__' is the version of the
> > > > > finalized
> > > > > > > > > feature
> > > > > > > > > > > > > > metadata
> > > > > > > > > > > > > > > > > (i.e. actual ZK node contents), while the
> > > > > > > > > > '__schema_version__'
> > > > > > > > > > > is
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > version of the schema of the data persisted
> > in
> > > > ZK.
> > > > > > > These
> > > > > > > > > > serve
> > > > > > > > > > > > > > > different
> > > > > > > > > > > > > > > > > purposes. '__data_version__' is is useful
> > > mainly
> > > > to
> > > > > > > > clients
> > > > > > > > > > > > during
> > > > > > > > > > > > > > > reads,
> > > > > > > > > > > > > > > > > to differentiate between the 2 versions of
> > > > > eventually
> > > > > > > > > > > consistent
> > > > > > > > > > > > > > > > 'finalized
> > > > > > > > > > > > > > > > > features' metadata (i.e. larger metadata
> > > version
> > > > is
> > > > > > > more
> > > > > > > > > > > recent).
> > > > > > > > > > > > > > > > > '__schema_version__' provides an additional
> > > > degree
> > > > > of
> > > > > > > > > > > > flexibility,
> > > > > > > > > > > > > > > where
> > > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > > we decide to change the schema for
> > '/features'
> > > > node
> > > > > > in
> > > > > > > ZK
> > > > > > > > > (in
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > future),
> > > > > > > > > > > > > > > > > then we can manage broker roll outs
> suitably
> > > > (i.e.
> > > > > > > > > > > > > > > > > serialization/deserialization of the ZK
> data
> > > can
> > > > be
> > > > > > > > handled
> > > > > > > > > > > > > safely).
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi Kowshik,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > If you're talking about a number that lets
> you
> > > know
> > > > > if
> > > > > > > data
> > > > > > > > > is
> > > > > > > > > > > more
> > > > > > > > > > > > > or
> > > > > > > > > > > > > > > > less recent, we would typically call that an
> > > epoch,
> > > > > and
> > > > > > > > not a
> > > > > > > > > > > > > version.
> > > > > > > > > > > > > > > For
> > > > > > > > > > > > > > > > the ZK data structures, the word "version" is
> > > > > typically
> > > > > > > > > > reserved
> > > > > > > > > > > > for
> > > > > > > > > > > > > > > > describing changes to the overall schema of
> the
> > > > data
> > > > > > that
> > > > > > > > is
> > > > > > > > > > > > written
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > ZooKeeper.  We don't even really change the
> > > > "version"
> > > > > > of
> > > > > > > > > those
> > > > > > > > > > > > > schemas
> > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > much, since most changes are
> > > backwards-compatible.
> > > > > But
> > > > > > > we
> > > > > > > > do
> > > > > > > > > > > > include
> > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > version field just in case.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I don't think we really need an epoch here,
> > > though,
> > > > > > since
> > > > > > > > we
> > > > > > > > > > can
> > > > > > > > > > > > just
> > > > > > > > > > > > > > > look
> > > > > > > > > > > > > > > > at the broker epoch.  Whenever the broker
> > > > registers,
> > > > > > its
> > > > > > > > > epoch
> > > > > > > > > > > will
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > greater than the previous broker epoch.  And
> > the
> > > > > newly
> > > > > > > > > > registered
> > > > > > > > > > > > > data
> > > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > take priority.  This will be a lot simpler
> than
> > > > > adding
> > > > > > a
> > > > > > > > > > separate
> > > > > > > > > > > > > epoch
> > > > > > > > > > > > > > > > system, I think.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 2. Regarding admin client needing min and
> max
> > > > > > > > information -
> > > > > > > > > > you
> > > > > > > > > > > > are
> > > > > > > > > > > > > > > > right!
> > > > > > > > > > > > > > > > > I've changed the KIP such that the Admin
> API
> > > also
> > > > > > > allows
> > > > > > > > > the
> > > > > > > > > > > user
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > read
> > > > > > > > > > > > > > > > > 'supported features' from a specific
> broker.
> > > > Please
> > > > > > > look
> > > > > > > > at
> > > > > > > > > > the
> > > > > > > > > > > > > > section
> > > > > > > > > > > > > > > > > "Admin API changes".
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 3. Regarding the use of `long` vs `Long` -
> it
> > > was
> > > > > not
> > > > > > > > > > > deliberate.
> > > > > > > > > > > > > > I've
> > > > > > > > > > > > > > > > > improved the KIP to just use `long` at all
> > > > places.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Sounds good.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 4. Regarding kafka.admin.FeatureCommand
> tool
> > -
> > > > you
> > > > > > are
> > > > > > > > > right!
> > > > > > > > > > > > I've
> > > > > > > > > > > > > > > > updated
> > > > > > > > > > > > > > > > > the KIP sketching the functionality
> provided
> > by
> > > > > this
> > > > > > > > tool,
> > > > > > > > > > with
> > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > examples. Please look at the section
> "Tooling
> > > > > support
> > > > > > > > > > > examples".
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thank you!
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks, Kowshik.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > cheers,
> > > > > > > > > > > > > > > > Colin
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > > Kowshik
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Wed, Mar 25, 2020 at 11:31 PM Colin
> > McCabe <
> > > > > > > > > > > > cmcc...@apache.org>
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks, Kowshik, this looks good.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > In the "Schema" section, do we really
> need
> > > both
> > > > > > > > > > > > > __schema_version__
> > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > __data_version__?  Can we just have a
> > single
> > > > > > version
> > > > > > > > > field
> > > > > > > > > > > > here?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Shouldn't the Admin(Client) function have
> > > some
> > > > > way
> > > > > > to
> > > > > > > > get
> > > > > > > > > > the
> > > > > > > > > > > > min
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > max
> > > > > > > > > > > > > > > > > > information that we're exposing as
> well?  I
> > > > guess
> > > > > > we
> > > > > > > > > could
> > > > > > > > > > > have
> > > > > > > > > > > > > > min,
> > > > > > > > > > > > > > > > max,
> > > > > > > > > > > > > > > > > > and current.  Unrelated: is the use of
> Long
> > > > > rather
> > > > > > > than
> > > > > > > > > > long
> > > > > > > > > > > > > > > deliberate
> > > > > > > > > > > > > > > > > > here?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > It would be good to describe how the
> > command
> > > > line
> > > > > > > tool
> > > > > > > > > > > > > > > > > > kafka.admin.FeatureCommand will work.
> For
> > > > > example
> > > > > > > the
> > > > > > > > > > flags
> > > > > > > > > > > > that
> > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > > > take and the output that it will generate
> > to
> > > > > > STDOUT.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > cheers,
> > > > > > > > > > > > > > > > > > Colin
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Tue, Mar 24, 2020, at 17:08, Kowshik
> > > > Prakasam
> > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I've opened KIP-584
> > > > > > > > > > > > > <https://issues.apache.org/jira/browse/KIP-584> <
> > > > > > > > > > > > > > https://issues.apache.org/jira/browse/KIP-584
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > which
> > > > > > > > > > > > > > > > > > > is intended to provide a versioning
> > scheme
> > > > for
> > > > > > > > > features.
> > > > > > > > > > > I'd
> > > > > > > > > > > > > like
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > use
> > > > > > > > > > > > > > > > > > > this thread to discuss the same. I'd
> > > > appreciate
> > > > > > any
> > > > > > > > > > > feedback
> > > > > > > > > > > > on
> > > > > > > > > > > > > > > this.
> > > > > > > > > > > > > > > > > > > Here
> > > > > > > > > > > > > > > > > > > is a link to KIP-584
> > > > > > > > > > > > > <https://issues.apache.org/jira/browse/KIP-584>:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features
> > > > > > > > > > > > > > > > > > >  .
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thank you!
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > > > > Kowshik
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to