Hi Jun, Colin,

Have we considered sticking with the range going from version 1 to N where
version 1 would be the equivalent of "disabled"? In the group.version case,
we could introduce group.version=1 that does basically nothing and
group.version=2 that enables the new protocol. I suppose that we could do
the same for the other features. I agree that it is less elegant but it
would avoid all the backward compatibility issues.

Best,
David

On Fri, Jun 28, 2024 at 6:02 PM Jun Rao <j...@confluent.io.invalid> wrote:

> Hi, Colin,
>
> Yes, #3 is the scenario that I was thinking about.
>
> In either approach, there will be some information missing in the old
> client. It seems that we should just pick the one that's less wrong. In the
> more common case when a feature is finalized on the server, presenting a
> supported feature with a range of 1-1 seems less wrong than omitting it in
> the output of "kafka-features describe".
>
> Thanks,
>
> Jun
>
> On Thu, Jun 27, 2024 at 9:52 PM Colin McCabe <cmcc...@apache.org> wrote:
>
> > Hi Jun,
> >
> > This is a fair question. I think there's a few different scenarios to
> > consider:
> >
> > 1. mixed server software versions in a single cluster
> >
> > 2. new client software + old server software
> >
> > 3. old client software + new server software
> >
> > In scenario #1 and #2, we have old (pre-3.9) server software in the mix.
> > This old software won't support features like group.version and
> > kraft.version. As we know, there are no features supported in 3.8 and
> older
> > except metadata.version itself. So the fact that we leave out some stuff
> > from the ApiVersionResponse isn't terribly significant. We weren't going
> to
> > be able to enable those post-3.8 features anyway, since enabling a
> feature
> > requires ALL server nodes to support it.
> >
> > Scenario #3 is more interesting. With new server software, features like
> > group.version and kraft.version may be enabled. But due to the
> KAFKA-17011
> > bug, we cannot accurately communicate the supported feature range back to
> > the old client.
> >
> > What is the impact of this? It depends on what the client is. Today, the
> > only client that cares about feature versions is admin client, which can
> > surface them through the Admin.describeFeatures API. So if we omit the
> > supported feature range, admi client won't report it. If we fudge it by
> > reporting it as 1-1 instead of 0-1, admin client will report the fudged
> > version.
> >
> > In theory, there could be other clients looking at the supported feature
> > ranges later, but I guess those will be post-3.8, if they ever exist, and
> > so not subject to this problem.
> >
> > AdminClient returns a separate map for "supported features" and
> "finalized
> > features." So leaving out the supported versions for group.version and
> > kraft.version will not prevent the client from returning the finalized
> > versions of those features to the old client.
> >
> > So basically we have a choice between missing information in
> > Admin.describeFeatures and wrong information. I would lean towards the
> > missing information path, but I guess we should try out an old build of
> > kafka-features.sh against a server with one of the new features enabled,
> to
> > make sure it looks the way we want.
> >
> > best,
> > Colin
> >
> >
> > On Thu, Jun 27, 2024, at 14:01, Jun Rao wrote:
> > > Hi, Colin,
> > >
> > > ApiVersionResponse includes both supported and finalized features. If
> we
> > > only suppress features in the supported field, but not in the finalized
> > > field, it can potentially lead to inconsistency in the older client.
> For
> > > example, if a future feature supporting V0 is finalized in the broker,
> an
> > > old client issuing V3 of ApiVersionRequest will see the feature in the
> > > finalized field, but not in the supported field.
> > >
> > > An alternative approach is to still include all features in the
> supported
> > > field, but replace minVersion of 0 with 1. This may still lead to
> > > inconsistency if a future feature is finalized at version 0. However,
> > since
> > > downgrading is less frequent than upgrading, this approach seems
> slightly
> > > more consistent.
> > >
> > > No matter what approach we take, it would be useful to document this
> > > inconsistency to the old client.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Wed, Jun 26, 2024 at 1:18 PM Jun Rao <j...@confluent.io> wrote:
> > >
> > >> Thanks for the reply, Justine and Colin. Sounds good to me.
> > >>
> > >> Jun
> > >>
> > >> On Wed, Jun 26, 2024 at 12:54 PM Colin McCabe <cmcc...@apache.org>
> > wrote:
> > >>
> > >>> Hi Justine,
> > >>>
> > >>> Yes, that was what I was thinking.
> > >>>
> > >>> best,
> > >>> Colin
> > >>>
> > >>> On Mon, Jun 24, 2024, at 11:11, Justine Olshan wrote:
> > >>> > My understanding is that the tools that don't rely on ApiVersions
> > should
> > >>> > still return 0s when it is the correct value. I believe these
> > commands
> > >>> do
> > >>> > not require this API and thus can show 0 as versions.
> > >>> >
> > >>> > Likewise, when the old ApiVersionsRequest is used to describe
> > features,
> > >>> we
> > >>> > can't return 0 versions and we won't be able to see group version
> > set.
> > >>> > However, the new api will return 0 and the group version correctly.
> > >>> >
> > >>> > Let me know if this is consistent with your thoughts, Colin.
> > >>> >
> > >>> > Justine
> > >>> >
> > >>> > On Mon, Jun 24, 2024 at 10:44 AM Jun Rao <j...@confluent.io.invalid
> >
> > >>> wrote:
> > >>> >
> > >>> >> Hi, Colin,
> > >>> >>
> > >>> >> Thanks for the update. The proposed change seems reasonable to me.
> > >>> Just one
> > >>> >> clarification.
> > >>> >>
> > >>> >> The KIP can show version 0 of certain features with
> version-mapping
> > >>> >> and feature-dependencies. Will that part change? For example, will
> > the
> > >>> tool
> > >>> >> show version 0 features with --release-version 3.8 or do we
> exclude
> > >>> them.
> > >>> >>
> > >>> >> bin/kafka-storage.sh version-mapping --release-version 3.6-IV1
> > >>> >>     metadata.version=13 (3.6-IV1)  transaction.version=0
> > >>> group.version=0
> > >>> >>     kraft.version=0
> > >>> >>
> > >>> >> Jun
> > >>> >>
> > >>> >> On Sat, Jun 22, 2024 at 2:19 PM José Armando García Sancio
> > >>> >> <jsan...@confluent.io.invalid> wrote:
> > >>> >>
> > >>> >> > Thanks for the update Colin. The changes make sense to me.
> > >>> >> >
> > >>> >> > Are you planning to update the KIP to reflect this new RPC
> > version?
> > >>> It
> > >>> >> > would be good to document the semantics explained above in the
> > KIP.
> > >>> >> >
> > >>> >> > Thanks!
> > >>> >> >
> > >>> >> > On Fri, Jun 21, 2024 at 8:22 PM Justine Olshan
> > >>> >> > <jols...@confluent.io.invalid> wrote:
> > >>> >> > >
> > >>> >> > > Ok makes sense. I will update my PR.
> > >>> >> > >
> > >>> >> > > On Fri, Jun 21, 2024 at 5:09 PM Colin McCabe <
> > cmcc...@apache.org>
> > >>> >> wrote:
> > >>> >> > >
> > >>> >> > > > I think it's better to suppress the response in v3. The
> issue
> > >>> with
> > >>> >> > > > modifying it is that there may be scenarios where [1, 1] is
> > the
> > >>> >> actual
> > >>> >> > > > supported range, and we'd want to know that. But leaving out
> > the
> > >>> >> > feature
> > >>> >> > > > should be OK for older clients (it will be the case with
> > clients
> > >>> old
> > >>> >> > enough
> > >>> >> > > > to send a v0, v1, or v2 ApiVersionsRequest anyway)
> > >>> >> > > >
> > >>> >> > > > best,
> > >>> >> > > > Colin
> > >>> >> > > >
> > >>> >> > > > On Fri, Jun 21, 2024, at 16:46, Justine Olshan wrote:
> > >>> >> > > > > Thanks Colin,
> > >>> >> > > > >
> > >>> >> > > > > This makes sense to me. Namely in the case where we
> perhaps
> > >>> don't
> > >>> >> > want to
> > >>> >> > > > > support version 0 anymore, we need the range to be able to
> > not
> > >>> >> > include 0.
> > >>> >> > > > > (In other words, we can't assume 0 is supported)
> > >>> >> > > > > It is unfortunate that this change is a bit tricky, but I
> > think
> > >>> >> it's
> > >>> >> > the
> > >>> >> > > > > best option.
> > >>> >> > > > >
> > >>> >> > > > > Can you clarify
> > >>> >> > > > >> The server will simply leave out the features whose
> minimum
> > >>> >> > supported
> > >>> >> > > > > value is 0 for clients that send v3
> > >>> >> > > > >
> > >>> >> > > > > For 3.8, I planned to set the 0s in the response to 1. Is
> it
> > >>> better
> > >>> >> > to
> > >>> >> > > > > suppress the zero version features in the response so we
> are
> > >>> >> > consistent
> > >>> >> > > > > between trunk and 3.8?
> > >>> >> > > > >
> > >>> >> > > > > Thanks,
> > >>> >> > > > > Justine
> > >>> >> > > > >
> > >>> >> > > > > On Fri, Jun 21, 2024 at 4:34 PM Colin McCabe <
> > >>> cmcc...@apache.org>
> > >>> >> > wrote:
> > >>> >> > > > >
> > >>> >> > > > >> Hi all,
> > >>> >> > > > >>
> > >>> >> > > > >> It seems that there was a bug in older versions of Kafka
> > which
> > >>> >> > caused
> > >>> >> > > > >> deserialization problems when a supported feature range
> > >>> included
> > >>> >> 0.
> > >>> >> > For
> > >>> >> > > > >> example, the range for group.version of [0, 1] would be a
> > >>> problem
> > >>> >> in
> > >>> >> > > > this
> > >>> >> > > > >> situation.
> > >>> >> > > > >>
> > >>> >> > > > >> This obviously makes supportedVersions kind of useless.
> Any
> > >>> >> feature
> > >>> >> > that
> > >>> >> > > > >> doesn't exist today is effectively at v0 today (v0 is
> > >>> equivalent
> > >>> >> to
> > >>> >> > > > "off").
> > >>> >> > > > >> But if we can't declare that the server supports [0, 1]
> or
> > >>> >> similar,
> > >>> >> > we
> > >>> >> > > > >> can't declare that it supports the feature being off.
> > >>> Therefore,
> > >>> >> no
> > >>> >> > > > rolling
> > >>> >> > > > >> upgrades are possible.
> > >>> >> > > > >>
> > >>> >> > > > >> We noticed this bug during the 3.8 release when we
> noticed
> > >>> >> problems
> > >>> >> > in
> > >>> >> > > > >> upgrade tests. As an addendum to KIP-1022, we're adding
> the
> > >>> >> > following
> > >>> >> > > > >> solution:
> > >>> >> > > > >>
> > >>> >> > > > >> - There will be a new v4 for ApiVersionsRequest
> > >>> >> > > > >>
> > >>> >> > > > >> - Clients that sent v4 will promise to correctly handle
> > ranges
> > >>> >> that
> > >>> >> > > > start
> > >>> >> > > > >> with 0, such as [0, 1]
> > >>> >> > > > >>
> > >>> >> > > > >> - The server will simply leave out the features whose
> > minimum
> > >>> >> > supported
> > >>> >> > > > >> value is 0 for clients that send v3
> > >>> >> > > > >>
> > >>> >> > > > >> - ApiVersionsRequest v4 will be supported in AK 3.9 and
> > >>> above. AK
> > >>> >> > 3.8
> > >>> >> > > > will
> > >>> >> > > > >> ship with ApiVersionsRequest v3 just as today.
> > >>> >> > > > >>
> > >>> >> > > > >> thanks,
> > >>> >> > > > >> Colin
> > >>> >> > > > >>
> > >>> >> > > > >>
> > >>> >> > > > >> On Mon, Apr 15, 2024, at 11:01, Justine Olshan wrote:
> > >>> >> > > > >> > Hey folks,
> > >>> >> > > > >> >
> > >>> >> > > > >> > Thanks everyone! I will go ahead and call it.
> > >>> >> > > > >> > The KIP passes with the following +1 votes:
> > >>> >> > > > >> >
> > >>> >> > > > >> > - Andrew Schofield (non-binding)
> > >>> >> > > > >> > - David Jacot (binding)
> > >>> >> > > > >> > - José Armando García Sancio (binding)
> > >>> >> > > > >> > - Jun Rao (binding)
> > >>> >> > > > >> >
> > >>> >> > > > >> > Thanks again,
> > >>> >> > > > >> > Justine
> > >>> >> > > > >> >
> > >>> >> > > > >> > On Fri, Apr 12, 2024 at 11:16 AM Jun Rao
> > >>> >> <j...@confluent.io.invalid
> > >>> >> > >
> > >>> >> > > > >> wrote:
> > >>> >> > > > >> >
> > >>> >> > > > >> >> Hi, Justine,
> > >>> >> > > > >> >>
> > >>> >> > > > >> >> Thanks for the KIP. +1
> > >>> >> > > > >> >>
> > >>> >> > > > >> >> Jun
> > >>> >> > > > >> >>
> > >>> >> > > > >> >> On Wed, Apr 10, 2024 at 9:13 AM José Armando García
> > Sancio
> > >>> >> > > > >> >> <jsan...@confluent.io.invalid> wrote:
> > >>> >> > > > >> >>
> > >>> >> > > > >> >> > Hi Justine,
> > >>> >> > > > >> >> >
> > >>> >> > > > >> >> > +1 (binding)
> > >>> >> > > > >> >> >
> > >>> >> > > > >> >> > Thanks for the improvement.
> > >>> >> > > > >> >> > --
> > >>> >> > > > >> >> > -José
> > >>> >> > > > >> >> >
> > >>> >> > > > >> >>
> > >>> >> > > > >>
> > >>> >> > > >
> > >>> >> >
> > >>> >> >
> > >>> >> >
> > >>> >> > --
> > >>> >> > -José
> > >>> >> >
> > >>> >>
> > >>>
> > >>
> >
>

Reply via email to