Hi Colin,

Thanks for the feedback. Those are very good points. I have made the
following changes to the KIP as you had suggested:
1. Included the `timeoutMs` field in the `UpdateFeaturesRequest` schema.
The initial implementation won't be making use of the field, but we can
always use it in the future as the need arises.
2. Modified the `FinalizedFeaturesEpoch` field in `ApiVersionsResponse` to
use int64. This is to avoid overflow problems in the future once ZK is gone.

I have also incorporated these changes into the versioning write path PR
that is currently under review: https://github.com/apache/kafka/pull/9001.


Cheers,
Kowshik



On Fri, Sep 25, 2020 at 4:57 PM Kowshik Prakasam <kpraka...@confluent.io>
wrote:

> Hi Jun,
>
> Thanks for the feedback. It's a very good point. I have now modified the
> KIP-584 write-up "goals" section a bit. It now mentions one of the goals as
> enabling rolling upgrades using a single restart (instead of 2). Also I
> have removed the text explicitly aiming for deprecation of IBP. Note that
> previously under "Potential features in Kafka" the IBP was mentioned under
> point (4) as a possible coarse-grained feature. Hopefully, now the 2
> sections of the KIP align with each other well.
>
>
> Cheers,
> Kowshik
>
>
> On Fri, Sep 25, 2020 at 2:03 PM Colin McCabe <cmcc...@apache.org> wrote:
>
>> On Tue, Sep 22, 2020, at 00:43, Kowshik Prakasam wrote:
>> > Hi all,
>> >
>> > I wanted to let you know that I have made the following changes to the
>> > KIP-584 write up. The purpose is to ensure the design is correct for a
>> few
>> > things which came up during implementation:
>> >
>>
>> Hi Kowshik,
>>
>> Thanks for the updates.
>>
>> >
>> > 1. Per FeatureUpdate error code: The UPDATE_FEATURES controller API is
>> no
>> > longer transactional. Going forward, we allow for individual
>> FeatureUpdate
>> > to succeed/fail in the request. As a result, the response schema now
>> > contains an error code per FeatureUpdate as well as a top-level error
>> code.
>> > Overall this is a better design because it better represents the nature
>> of
>> > the API: each FeatureUpdate in the request is independent of the other
>> > updates, and the controller can process/apply these independently to ZK.
>> > When an UPDATE_FEATURES request fails, this new design provides better
>> > clarity to the caller on which FeatureUpdate could not be applied (via
>> the
>> > individual error codes). In the previous design, we were unable to
>> achieve
>> > such an increased level of clarity in communicating the error codes.
>> >
>>
>> OK
>>
>> >
>> > 2. Due to #1, there were some minor changes required to the proposed
>> Admin
>> > APIs (describeFeatures and updateFeatures). A few unnecessary public
>> APIs
>> > have been removed, and couple essential ones have been added. The latest
>> > changes now represent the latest design.
>> >
>> > 3. The timeoutMs field has been removed from the the UPDATE_FEATURES API
>> > request, since it was not found to be required during implementation.
>> >
>>
>> Please don't get rid of timeoutMs.  timeoutMs is required if you want to
>> implement the ability to timeout the call if the controller can't get to it
>> in time.  This is important for avoiding congestion collapse where the
>> controller collapses under the weight of lots of retries of the same set of
>> calls.
>>
>> We may not be able to do it in the initial implementation, but we will
>> eventually implement this for all the controller-bound RPCs.
>>
>> > >
>> > > 2. Finalized feature version epoch data type has been made to be int32
>> > > (instead of int64). The reason is that the epoch value is the value
>> of ZK
>> > > node version, whose data type is int32.
>> > >
>>
>> Sorry, I missed this earlier.  Using 16 bit feature levels seems fine.
>> However, please don't use a 32-bit epoch here.  We deliberately made the
>> epoch 64 bits to avoid overflow problems in the future once ZK is gone.
>>
>> best,
>> Colin
>>
>> > > 3. Introduced a new 'status' field in the '/features' ZK node schema.
>> The
>> > > purpose is to implement Colin's earlier point for the strategy for
>> > > transitioning from not having a /features znode to having one. An
>> > > explanation has been provided in the following section of the KIP
>> detailing
>> > > the different cases:
>> > >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP584:Versioningschemeforfeatures-FeatureZKnodestatus
>> > > .
>> > >
>> > > Please let me know if you have any questions or concerns.
>> > >
>> > >
>> > > Cheers,
>> > > Kowshik
>> > >
>> > >
>> > >
>> > > Cheers,
>> > > Kowshik
>> > >
>> > > On Tue, Apr 28, 2020 at 11:24 PM Kowshik Prakasam <
>> kpraka...@confluent.io>
>> > > wrote:
>> > >
>> > >> Hi all,
>> > >>
>> > >> This KIP vote has been open for ~12 days. The summary of the votes is
>> > >> that we have 3 binding votes (Colin, Guozhang, Jun), and 3
>> non-binding
>> > >> votes (David, Dhruvil, Boyang). Therefore, the KIP vote passes. I'll
>> mark
>> > >> KIP as accepted and start working on the implementation.
>> > >>
>> > >> Thanks a lot!
>> > >>
>> > >>
>> > >> Cheers,
>> > >> Kowshik
>> > >>
>> > >> On Mon, Apr 27, 2020 at 12:15 PM Colin McCabe <cmcc...@apache.org>
>> wrote:
>> > >>
>> > >>> Thanks, Kowshik.  +1 (binding)
>> > >>>
>> > >>> best,
>> > >>> Colin
>> > >>>
>> > >>> On Sat, Apr 25, 2020, at 13:20, Kowshik Prakasam wrote:
>> > >>> > Hi Colin,
>> > >>> >
>> > >>> > Thanks for the explanation! I agree with you, and I have updated
>> the
>> > >>> > KIP.
>> > >>> > Here is a link to relevant section:
>> > >>> >
>> > >>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Controller:ZKnodebootstrapwithdefaultvalues
>> > >>> >
>> > >>> >
>> > >>> > Cheers,
>> > >>> > Kowshik
>> > >>> >
>> > >>> > On Fri, Apr 24, 2020 at 8:50 PM Colin McCabe <cmcc...@apache.org>
>> > >>> wrote:
>> > >>> >
>> > >>> > > On Fri, Apr 24, 2020, at 00:01, Kowshik Prakasam wrote:
>> > >>> > > > (Kowshik): Great point! However for case #1, I'm not sure why
>> we
>> > >>> need to
>> > >>> > > > create a '/features' ZK node with disabled features. Instead,
>> do
>> > >>> you see
>> > >>> > > > any drawback if we just do not create it? i.e. if IBP is less
>> than
>> > >>> 2.6,
>> > >>> > > the
>> > >>> > > > controller treats the case as though the versioning system is
>> > >>> completely
>> > >>> > > > disabled, and would not create a non-existing '/features'
>> node.
>> > >>> > >
>> > >>> > > Hi Kowshik,
>> > >>> > >
>> > >>> > > When the IBP is less than 2.6, but the software has been
>> upgraded to
>> > >>> a
>> > >>> > > state where it supports this KIP, that
>> > >>> > >  means the user is upgrading from an earlier version of the
>> > >>> software.  In
>> > >>> > > this case, we want to start with all the features disabled and
>> allow
>> > >>> the
>> > >>> > > user to enable them when they are ready.
>> > >>> > >
>> > >>> > > Enabling all the possible features immediately after an upgrade
>> > >>> could be
>> > >>> > > harmful to the cluster.  On the other hand, for a new cluster,
>> we do
>> > >>> want
>> > >>> > > to enable all the possible features immediately . I was
>> proposing
>> > >>> this as a
>> > >>> > > way to distinguish the two cases (since the new cluster will
>> never be
>> > >>> > > started with an old IBP).
>> > >>> > >
>> > >>> > > > Colin MccCabe wrote:
>> > >>> > > > > And now, something a little bit bigger (sorry).  For
>> finalized
>> > >>> > > features,
>> > >>> > > > > why do we need both min_version_level and max_version_level?
>> > >>> Assuming
>> > >>> > > that
>> > >>> > > > > we want all the brokers to be on the same feature version
>> level,
>> > >>> we
>> > >>> > > really only care
>> > >>> > > > > about three numbers for each feature, right?  The minimum
>> > >>> supported
>> > >>> > > version
>> > >>> > > > > level, the maximum supported version level, and the current
>> > >>> active
>> > >>> > > version level.
>> > >>> > > >
>> > >>> > > > > We don't actually want different brokers to be on different
>> > >>> versions of
>> > >>> > > > > the same feature, right?  So we can just have one number for
>> > >>> current
>> > >>> > > > > version level, rather than two.  At least that's what I was
>> > >>> thinking
>> > >>> > > -- let
>> > >>> > > > > me know if I missed something.
>> > >>> > > >
>> > >>> > > > (Kowshik): It is my understanding that the "current active
>> version
>> > >>> level"
>> > >>> > > > that you have mentioned, is the "max_version_level". But we
>> still
>> > >>> > > > maintain/publish both min and max version levels, because, the
>> > >>> detail
>> > >>> > > about
>> > >>> > > > min level is useful to external clients. This is described
>> below.
>> > >>> > > >
>> > >>> > > > For any feature F, think of the closed range:
>> [min_version_level,
>> > >>> > > > max_version_level] as the range of finalized versions, that's
>> > >>> guaranteed
>> > >>> > > to
>> > >>> > > > be supported by all brokers in the cluster.
>> > >>> > > >  - "max_version_level" is the finalized highest common version
>> > >>> among all
>> > >>> > > > brokers,
>> > >>> > > >  - "min_version_level" is the finalized lowest common version
>> > >>> among all
>> > >>> > > > brokers.
>> > >>> > > >
>> > >>> > > > Next, think of "client" here as the "user of the new feature
>> > >>> versions
>> > >>> > > > system". Imagine that such a client learns about finalized
>> feature
>> > >>> > > > versions, and exercises some logic based on the version. These
>> > >>> clients
>> > >>> > > can
>> > >>> > > > be of 2 types:
>> > >>> > > > 1. Some part of the broker code itself could behave like a
>> client
>> > >>> trying
>> > >>> > > to
>> > >>> > > > use some feature that's "internal" to the broker cluster.
>> Such a
>> > >>> client
>> > >>> > > > would learn the latest finalized features via ZK.
>> > >>> > > > 2. An external system (ex: Streams) could behave like a
>> client,
>> > >>> trying to
>> > >>> > > > use some "external" facing feature. Such a client would learn
>> > >>> latest
>> > >>> > > > finalized features via ApiVersionsRequest. Ex:
>> group_coordinator
>> > >>> feature
>> > >>> > > > described in the KIP.
>> > >>> > > >
>> > >>> > > > Next, imagine that for F, the max_version_level is
>> successfully
>> > >>> bumped by
>> > >>> > > > +1 (via Controller API). Now it is guaranteed that all brokers
>> > >>> (i.e.
>> > >>> > > > internal clients) understand max_version_level + 1. However,
>> it is
>> > >>> still
>> > >>> > > > not guaranteed that all external clients have support for (or
>> have
>> > >>> > > > activated) the logic for the newer version. Why? Because,
>> this is
>> > >>> > > > subjective as explained next:
>> > >>> > > >
>> > >>> > > > 1. On one hand, imagine F as an internal feature only
>> relevant to
>> > >>> > > Brokers.
>> > >>> > > > The binary for the internal client logic is controlled by
>> Broker
>> > >>> cluster
>> > >>> > > > deployments. When shipping a new Broker release, we wouldn't
>> bump
>> > >>> max
>> > >>> > > > "supported" feature version for F by 1, unless we have
>> introduced
>> > >>> some
>> > >>> > > new
>> > >>> > > > logic (with a potentially breaking change) in the Broker.
>> > >>> Furthermore,
>> > >>> > > such
>> > >>> > > > feature logic in the broker should/will not be implemented in
>> a
>> > >>> way that
>> > >>> > > it
>> > >>> > > > would activate logic for an older feature version after it has
>> > >>> migrated
>> > >>> > > to
>> > >>> > > > using the logic for a newer feature version (because this
>> could
>> > >>> break the
>> > >>> > > > cluster!). For these cases, max_version_level will be very
>> useful
>> > >>> for
>> > >>> > > > decision making.
>> > >>> > > >
>> > >>> > > > 2. On the other hand, imagine F as an external facing feature.
>> > >>> External
>> > >>> > > > clients are not within the control of Broker cluster. An
>> external
>> > >>> client
>> > >>> > > > may not have upgraded it's code (yet) to use
>> 'max_version_level +
>> > >>> 1'.
>> > >>> > > But,
>> > >>> > > > the Kafka cluster could have been deployed with support for
>> > >>> > > > 'max_version_level + 1' of an external facing feature. Now,
>> these
>> > >>> > > external
>> > >>> > > > clients (until an upgrade) are benefitted in learning "what
>> is the
>> > >>> lowest
>> > >>> > > > common version for F among all brokers?". This is where the
>> > >>> > > > "min_version_level" becomes useful. Using this, a client could
>> > >>> learn the
>> > >>> > > > specific supported versions that's lower than
>> max_version_level
>> > >>> (instead
>> > >>> > > of
>> > >>> > > > assuming that all brokers support the range: [1,
>> > >>> max_version_level]). For
>> > >>> > > > example, if the cluster deprecates "min_version_level", then
>> the
>> > >>> client
>> > >>> > > > becomes aware because it periodically learns the latest
>> > >>> > > "min_version_level"
>> > >>> > > > via ApiVersionsRequest.
>> > >>> > > >
>> > >>> > >
>> > >>> > > Thanks for the explanation.  I agree that this does make sense
>> when
>> > >>> you
>> > >>> > > take the client perspective into account.
>> > >>> > >
>> > >>> > > best,
>> > >>> > > Colin
>> > >>> > >
>> > >>> > >
>> > >>> > > >
>> > >>> > > > Cheers,
>> > >>> > > > Kowshik
>> > >>> > > >
>> > >>> > > >
>> > >>> > > >
>> > >>> > > >
>> > >>> > > > On Thu, Apr 23, 2020 at 12:07 PM Colin McCabe <
>> cmcc...@apache.org>
>> > >>> > > wrote:
>> > >>> > > >
>> > >>> > > > > Hi Kowshik,
>> > >>> > > > >
>> > >>> > > > > Thanks again for working on this-- it looks great.  I went
>> over
>> > >>> the KIP
>> > >>> > > > > again and have a few more comments.
>> > >>> > > > >
>> > >>> > > > > ===
>> > >>> > > > >
>> > >>> > > > > It would be good to note that deprecating a feature version
>> (in
>> > >>> other
>> > >>> > > > > words, increasing minVersionLevel on the broker) is an
>> > >>> incompatible
>> > >>> > > change,
>> > >>> > > > > which requires a major release of Kafka.
>> > >>> > > > >
>> > >>> > > > > ===
>> > >>> > > > >
>> > >>> > > > > I think the strategy for transitioning from not having a
>> > >>> /features
>> > >>> > > znode
>> > >>> > > > > to having one could use some work. The current proposal is
>> to
>> > >>> wait for
>> > >>> > > all
>> > >>> > > > > the brokers to fill in their feature znodes and then pick
>> the
>> > >>> highest
>> > >>> > > > > common versions.  But that requires blocking in the
>> controller
>> > >>> startup
>> > >>> > > code
>> > >>> > > > > until the whole cluster is active (technically, a point in
>> time
>> > >>> which
>> > >>> > > we
>> > >>> > > > > never really know that we have reached...)
>> > >>> > > > >
>> > >>> > > > > Instead, I think it would be better to have a strategy like
>> this:
>> > >>> > > > > 1. If the controller comes up and there is no /features
>> znode
>> > >>> AND the
>> > >>> > > IBP
>> > >>> > > > > is less than 2.6, create a /features znode where all the
>> > >>> features are
>> > >>> > > > > disabled.
>> > >>> > > > > 2. If the controller comes up and there is no /features
>> znode
>> > >>> AND the
>> > >>> > > IBP
>> > >>> > > > > is greater than or equal to 2.6, create a /features znode
>> where
>> > >>> all the
>> > >>> > > > > features are enabled at the highest versions supported by
>> the
>> > >>> > > controller.
>> > >>> > > > >
>> > >>> > > > > People upgrading from the pre-KIP-584
>> > >>> <https://issues.apache.org/jira/browse/KIP-584>
>> > >>> > > > > <https://issues.apache.org/jira/browse/KIP-584> world will
>> end
>> > >>> up in
>> > >>> > > case
>> > >>> > > > > #1 because they have to do a double roll to upgrade, and
>> during
>> > >>> the
>> > >>> > > first
>> > >>> > > > > roll, the IBP is unchanged.  People creating new clusters
>> from
>> > >>> scratch
>> > >>> > > will
>> > >>> > > > > end up in case #2, which is what we want since we don't
>> want a
>> > >>> brand
>> > >>> > > new
>> > >>> > > > > cluster to be using old feature flag versions.
>> > >>> > > > >
>> > >>> > > > > ===
>> > >>> > > > >
>> > >>> > > > > UpdateFeaturesResponse#ErrorMessage should specify
>> > >>> nullableVersions
>> > >>> > > since
>> > >>> > > > > null is a valid value here
>> > >>> > > > >
>> > >>> > > > > Also, in the message format, the tags we use need to start
>> at 0.
>> > >>> > > > >
>> > >>> > > > > ===
>> > >>> > > > >
>> > >>> > > > > I don't think we need the FEATURE_UPDATE_IN_PROGRESS error
>> > >>> code.  The
>> > >>> > > > > controller is basically single-threaded and will only do
>> one of
>> > >>> these
>> > >>> > > > > operations at once.  Even if it weren't, though, we could
>> simply
>> > >>> block
>> > >>> > > the
>> > >>> > > > > second operation behind the first one.
>> > >>> > > > >
>> > >>> > > > > ===
>> > >>> > > > >
>> > >>> > > > > For updateFeatures, it would be good to specify that if a
>> single
>> > >>> > > feature
>> > >>> > > > > version update in the batch can't be done, none of them are
>> > >>> done.  I
>> > >>> > > think
>> > >>> > > > > this was the intention, but I wasn't able to find it
>> spelled out
>> > >>> > > (maybe i
>> > >>> > > > > missed it).
>> > >>> > > > >
>> > >>> > > > > ===
>> > >>> > > > >
>> > >>> > > > > And now, something a little bit bigger (sorry).  For
>> finalized
>> > >>> > > features,
>> > >>> > > > > why do we need both min_version_level and max_version_level?
>> > >>> Assuming
>> > >>> > > that
>> > >>> > > > > we want all the brokers to be on the same feature version
>> level,
>> > >>> we
>> > >>> > > really
>> > >>> > > > > only care about three numbers for each feature, right?  The
>> > >>> minimum
>> > >>> > > > > supported version level, the maximum supported version
>> level,
>> > >>> and the
>> > >>> > > > > current active version level.
>> > >>> > > > >
>> > >>> > > > > We don't actually want different brokers to be on different
>> > >>> versions of
>> > >>> > > > > the same feature, right?  So we can just have one number for
>> > >>> current
>> > >>> > > > > version level, rather than two.  At least that's what I was
>> > >>> thinking
>> > >>> > > -- let
>> > >>> > > > > me know if I missed something.
>> > >>> > > > >
>> > >>> > > > > best,
>> > >>> > > > > Colin
>> > >>> > > > >
>> > >>> > > > >
>> > >>> > > > > On Tue, Apr 21, 2020, at 13:01, Dhruvil Shah wrote:
>> > >>> > > > > > Thanks for the KIP! +1 (non-binding)
>> > >>> > > > > >
>> > >>> > > > > > On Tue, Apr 21, 2020 at 6:09 AM David Jacot <
>> > >>> dja...@confluent.io>
>> > >>> > > wrote:
>> > >>> > > > > >
>> > >>> > > > > > > Great KIP, thanks! +1 (non-binding)
>> > >>> > > > > > >
>> > >>> > > > > > > On Fri, Apr 17, 2020 at 8:56 PM Guozhang Wang <
>> > >>> wangg...@gmail.com>
>> > >>> > > > > wrote:
>> > >>> > > > > > >
>> > >>> > > > > > > > Thanks for the great KIP Kowshik, +1 (binding).
>> > >>> > > > > > > >
>> > >>> > > > > > > > On Fri, Apr 17, 2020 at 11:22 AM Jun Rao <
>> j...@confluent.io
>> > >>> >
>> > >>> > > wrote:
>> > >>> > > > > > > >
>> > >>> > > > > > > > > Hi, Kowshik,
>> > >>> > > > > > > > >
>> > >>> > > > > > > > > Thanks for the KIP. +1
>> > >>> > > > > > > > >
>> > >>> > > > > > > > > Jun
>> > >>> > > > > > > > >
>> > >>> > > > > > > > > On Thu, Apr 16, 2020 at 11:14 AM Kowshik Prakasam <
>> > >>> > > > > > > > kpraka...@confluent.io>
>> > >>> > > > > > > > > wrote:
>> > >>> > > > > > > > >
>> > >>> > > > > > > > > > Hi all,
>> > >>> > > > > > > > > >
>> > >>> > > > > > > > > > I'd like to start a vote for KIP-584
>> > >>> <https://issues.apache.org/jira/browse/KIP-584>
>> > >>> > > > > <https://issues.apache.org/jira/browse/KIP-584>. The link
>> to
>> > >>> the KIP
>> > >>> > > can
>> > >>> > > > > be
>> > >>> > > > > > > found
>> > >>> > > > > > > > > > here:
>> > >>> > > > > > > > > >
>> > >>> > > > > > > > > >
>> > >>> > > > > > > > >
>> > >>> > > > > > > >
>> > >>> > > > > > >
>> > >>> > > > >
>> > >>> > >
>> > >>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features
>> > >>> > > > > > > > > > .
>> > >>> > > > > > > > > >
>> > >>> > > > > > > > > > Thanks!
>> > >>> > > > > > > > > >
>> > >>> > > > > > > > > >
>> > >>> > > > > > > > > > Cheers,
>> > >>> > > > > > > > > > Kowshik
>> > >>> > > > > > > > > >
>> > >>> > > > > > > > >
>> > >>> > > > > > > >
>> > >>> > > > > > > >
>> > >>> > > > > > > > --
>> > >>> > > > > > > > -- Guozhang
>> > >>> > > > > > > >
>> > >>> > > > > > >
>> > >>> > > > > >
>> > >>> > > > >
>> > >>> > > >
>> > >>> > >
>> > >>> >
>> > >>>
>> > >>
>> >
>>
>

Reply via email to