Hi Colin,

Thanks for the explanation! I agree with you, and I have updated the KIP.
Here is a link to relevant section:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Controller:ZKnodebootstrapwithdefaultvalues


Cheers,
Kowshik

On Fri, Apr 24, 2020 at 8:50 PM Colin McCabe <cmcc...@apache.org> wrote:

> On Fri, Apr 24, 2020, at 00:01, Kowshik Prakasam wrote:
> > (Kowshik): Great point! However for case #1, I'm not sure why we need to
> > create a '/features' ZK node with disabled features. Instead, do you see
> > any drawback if we just do not create it? i.e. if IBP is less than 2.6,
> the
> > controller treats the case as though the versioning system is completely
> > disabled, and would not create a non-existing '/features' node.
>
> Hi Kowshik,
>
> When the IBP is less than 2.6, but the software has been upgraded to a
> state where it supports this KIP, that
>  means the user is upgrading from an earlier version of the software.  In
> this case, we want to start with all the features disabled and allow the
> user to enable them when they are ready.
>
> Enabling all the possible features immediately after an upgrade could be
> harmful to the cluster.  On the other hand, for a new cluster, we do want
> to enable all the possible features immediately . I was proposing this as a
> way to distinguish the two cases (since the new cluster will never be
> started with an old IBP).
>
> > Colin MccCabe wrote:
> > > And now, something a little bit bigger (sorry).  For finalized
> features,
> > > why do we need both min_version_level and max_version_level?  Assuming
> that
> > > we want all the brokers to be on the same feature version level, we
> really only care
> > > about three numbers for each feature, right?  The minimum supported
> version
> > > level, the maximum supported version level, and the current active
> version level.
> >
> > > We don't actually want different brokers to be on different versions of
> > > the same feature, right?  So we can just have one number for current
> > > version level, rather than two.  At least that's what I was thinking
> -- let
> > > me know if I missed something.
> >
> > (Kowshik): It is my understanding that the "current active version level"
> > that you have mentioned, is the "max_version_level". But we still
> > maintain/publish both min and max version levels, because, the detail
> about
> > min level is useful to external clients. This is described below.
> >
> > For any feature F, think of the closed range: [min_version_level,
> > max_version_level] as the range of finalized versions, that's guaranteed
> to
> > be supported by all brokers in the cluster.
> >  - "max_version_level" is the finalized highest common version among all
> > brokers,
> >  - "min_version_level" is the finalized lowest common version among all
> > brokers.
> >
> > Next, think of "client" here as the "user of the new feature versions
> > system". Imagine that such a client learns about finalized feature
> > versions, and exercises some logic based on the version. These clients
> can
> > be of 2 types:
> > 1. Some part of the broker code itself could behave like a client trying
> to
> > use some feature that's "internal" to the broker cluster. Such a client
> > would learn the latest finalized features via ZK.
> > 2. An external system (ex: Streams) could behave like a client, trying to
> > use some "external" facing feature. Such a client would learn latest
> > finalized features via ApiVersionsRequest. Ex: group_coordinator feature
> > described in the KIP.
> >
> > Next, imagine that for F, the max_version_level is successfully bumped by
> > +1 (via Controller API). Now it is guaranteed that all brokers (i.e.
> > internal clients) understand max_version_level + 1. However, it is still
> > not guaranteed that all external clients have support for (or have
> > activated) the logic for the newer version. Why? Because, this is
> > subjective as explained next:
> >
> > 1. On one hand, imagine F as an internal feature only relevant to
> Brokers.
> > The binary for the internal client logic is controlled by Broker cluster
> > deployments. When shipping a new Broker release, we wouldn't bump max
> > "supported" feature version for F by 1, unless we have introduced some
> new
> > logic (with a potentially breaking change) in the Broker. Furthermore,
> such
> > feature logic in the broker should/will not be implemented in a way that
> it
> > would activate logic for an older feature version after it has migrated
> to
> > using the logic for a newer feature version (because this could break the
> > cluster!). For these cases, max_version_level will be very useful for
> > decision making.
> >
> > 2. On the other hand, imagine F as an external facing feature. External
> > clients are not within the control of Broker cluster. An external client
> > may not have upgraded it's code (yet) to use 'max_version_level + 1'.
> But,
> > the Kafka cluster could have been deployed with support for
> > 'max_version_level + 1' of an external facing feature. Now, these
> external
> > clients (until an upgrade) are benefitted in learning "what is the lowest
> > common version for F among all brokers?". This is where the
> > "min_version_level" becomes useful. Using this, a client could learn the
> > specific supported versions that's lower than max_version_level (instead
> of
> > assuming that all brokers support the range: [1, max_version_level]). For
> > example, if the cluster deprecates "min_version_level", then the client
> > becomes aware because it periodically learns the latest
> "min_version_level"
> > via ApiVersionsRequest.
> >
>
> Thanks for the explanation.  I agree that this does make sense when you
> take the client perspective into account.
>
> best,
> Colin
>
>
> >
> > Cheers,
> > Kowshik
> >
> >
> >
> >
> > On Thu, Apr 23, 2020 at 12:07 PM Colin McCabe <cmcc...@apache.org>
> wrote:
> >
> > > Hi Kowshik,
> > >
> > > Thanks again for working on this-- it looks great.  I went over the KIP
> > > again and have a few more comments.
> > >
> > > ===
> > >
> > > It would be good to note that deprecating a feature version (in other
> > > words, increasing minVersionLevel on the broker) is an incompatible
> change,
> > > which requires a major release of Kafka.
> > >
> > > ===
> > >
> > > I think the strategy for transitioning from not having a /features
> znode
> > > to having one could use some work. The current proposal is to wait for
> all
> > > the brokers to fill in their feature znodes and then pick the highest
> > > common versions.  But that requires blocking in the controller startup
> code
> > > until the whole cluster is active (technically, a point in time which
> we
> > > never really know that we have reached...)
> > >
> > > Instead, I think it would be better to have a strategy like this:
> > > 1. If the controller comes up and there is no /features znode AND the
> IBP
> > > is less than 2.6, create a /features znode where all the features are
> > > disabled.
> > > 2. If the controller comes up and there is no /features znode AND the
> IBP
> > > is greater than or equal to 2.6, create a /features znode where all the
> > > features are enabled at the highest versions supported by the
> controller.
> > >
> > > People upgrading from the pre-KIP-584
> > > <https://issues.apache.org/jira/browse/KIP-584> world will end up in
> case
> > > #1 because they have to do a double roll to upgrade, and during the
> first
> > > roll, the IBP is unchanged.  People creating new clusters from scratch
> will
> > > end up in case #2, which is what we want since we don't want a brand
> new
> > > cluster to be using old feature flag versions.
> > >
> > > ===
> > >
> > > UpdateFeaturesResponse#ErrorMessage should specify nullableVersions
> since
> > > null is a valid value here
> > >
> > > Also, in the message format, the tags we use need to start at 0.
> > >
> > > ===
> > >
> > > I don't think we need the FEATURE_UPDATE_IN_PROGRESS error code.  The
> > > controller is basically single-threaded and will only do one of these
> > > operations at once.  Even if it weren't, though, we could simply block
> the
> > > second operation behind the first one.
> > >
> > > ===
> > >
> > > For updateFeatures, it would be good to specify that if a single
> feature
> > > version update in the batch can't be done, none of them are done.  I
> think
> > > this was the intention, but I wasn't able to find it spelled out
> (maybe i
> > > missed it).
> > >
> > > ===
> > >
> > > And now, something a little bit bigger (sorry).  For finalized
> features,
> > > why do we need both min_version_level and max_version_level?  Assuming
> that
> > > we want all the brokers to be on the same feature version level, we
> really
> > > only care about three numbers for each feature, right?  The minimum
> > > supported version level, the maximum supported version level, and the
> > > current active version level.
> > >
> > > We don't actually want different brokers to be on different versions of
> > > the same feature, right?  So we can just have one number for current
> > > version level, rather than two.  At least that's what I was thinking
> -- let
> > > me know if I missed something.
> > >
> > > best,
> > > Colin
> > >
> > >
> > > On Tue, Apr 21, 2020, at 13:01, Dhruvil Shah wrote:
> > > > Thanks for the KIP! +1 (non-binding)
> > > >
> > > > On Tue, Apr 21, 2020 at 6:09 AM David Jacot <dja...@confluent.io>
> wrote:
> > > >
> > > > > Great KIP, thanks! +1 (non-binding)
> > > > >
> > > > > On Fri, Apr 17, 2020 at 8:56 PM Guozhang Wang <wangg...@gmail.com>
> > > wrote:
> > > > >
> > > > > > Thanks for the great KIP Kowshik, +1 (binding).
> > > > > >
> > > > > > On Fri, Apr 17, 2020 at 11:22 AM Jun Rao <j...@confluent.io>
> wrote:
> > > > > >
> > > > > > > Hi, Kowshik,
> > > > > > >
> > > > > > > Thanks for the KIP. +1
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > > > On Thu, Apr 16, 2020 at 11:14 AM Kowshik Prakasam <
> > > > > > kpraka...@confluent.io>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > I'd like to start a vote for KIP-584
> > > <https://issues.apache.org/jira/browse/KIP-584>. The link to the KIP
> can
> > > be
> > > > > found
> > > > > > > > here:
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features
> > > > > > > > .
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > >
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Kowshik
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > -- Guozhang
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to