Hi, Colin, Yes, #3 is the scenario that I was thinking about.
In either approach, there will be some information missing in the old client. It seems that we should just pick the one that's less wrong. In the more common case when a feature is finalized on the server, presenting a supported feature with a range of 1-1 seems less wrong than omitting it in the output of "kafka-features describe". Thanks, Jun On Thu, Jun 27, 2024 at 9:52 PM Colin McCabe <cmcc...@apache.org> wrote: > Hi Jun, > > This is a fair question. I think there's a few different scenarios to > consider: > > 1. mixed server software versions in a single cluster > > 2. new client software + old server software > > 3. old client software + new server software > > In scenario #1 and #2, we have old (pre-3.9) server software in the mix. > This old software won't support features like group.version and > kraft.version. As we know, there are no features supported in 3.8 and older > except metadata.version itself. So the fact that we leave out some stuff > from the ApiVersionResponse isn't terribly significant. We weren't going to > be able to enable those post-3.8 features anyway, since enabling a feature > requires ALL server nodes to support it. > > Scenario #3 is more interesting. With new server software, features like > group.version and kraft.version may be enabled. But due to the KAFKA-17011 > bug, we cannot accurately communicate the supported feature range back to > the old client. > > What is the impact of this? It depends on what the client is. Today, the > only client that cares about feature versions is admin client, which can > surface them through the Admin.describeFeatures API. So if we omit the > supported feature range, admi client won't report it. If we fudge it by > reporting it as 1-1 instead of 0-1, admin client will report the fudged > version. > > In theory, there could be other clients looking at the supported feature > ranges later, but I guess those will be post-3.8, if they ever exist, and > so not subject to this problem. > > AdminClient returns a separate map for "supported features" and "finalized > features." So leaving out the supported versions for group.version and > kraft.version will not prevent the client from returning the finalized > versions of those features to the old client. > > So basically we have a choice between missing information in > Admin.describeFeatures and wrong information. I would lean towards the > missing information path, but I guess we should try out an old build of > kafka-features.sh against a server with one of the new features enabled, to > make sure it looks the way we want. > > best, > Colin > > > On Thu, Jun 27, 2024, at 14:01, Jun Rao wrote: > > Hi, Colin, > > > > ApiVersionResponse includes both supported and finalized features. If we > > only suppress features in the supported field, but not in the finalized > > field, it can potentially lead to inconsistency in the older client. For > > example, if a future feature supporting V0 is finalized in the broker, an > > old client issuing V3 of ApiVersionRequest will see the feature in the > > finalized field, but not in the supported field. > > > > An alternative approach is to still include all features in the supported > > field, but replace minVersion of 0 with 1. This may still lead to > > inconsistency if a future feature is finalized at version 0. However, > since > > downgrading is less frequent than upgrading, this approach seems slightly > > more consistent. > > > > No matter what approach we take, it would be useful to document this > > inconsistency to the old client. > > > > Thanks, > > > > Jun > > > > On Wed, Jun 26, 2024 at 1:18 PM Jun Rao <j...@confluent.io> wrote: > > > >> Thanks for the reply, Justine and Colin. Sounds good to me. > >> > >> Jun > >> > >> On Wed, Jun 26, 2024 at 12:54 PM Colin McCabe <cmcc...@apache.org> > wrote: > >> > >>> Hi Justine, > >>> > >>> Yes, that was what I was thinking. > >>> > >>> best, > >>> Colin > >>> > >>> On Mon, Jun 24, 2024, at 11:11, Justine Olshan wrote: > >>> > My understanding is that the tools that don't rely on ApiVersions > should > >>> > still return 0s when it is the correct value. I believe these > commands > >>> do > >>> > not require this API and thus can show 0 as versions. > >>> > > >>> > Likewise, when the old ApiVersionsRequest is used to describe > features, > >>> we > >>> > can't return 0 versions and we won't be able to see group version > set. > >>> > However, the new api will return 0 and the group version correctly. > >>> > > >>> > Let me know if this is consistent with your thoughts, Colin. > >>> > > >>> > Justine > >>> > > >>> > On Mon, Jun 24, 2024 at 10:44 AM Jun Rao <j...@confluent.io.invalid> > >>> wrote: > >>> > > >>> >> Hi, Colin, > >>> >> > >>> >> Thanks for the update. The proposed change seems reasonable to me. > >>> Just one > >>> >> clarification. > >>> >> > >>> >> The KIP can show version 0 of certain features with version-mapping > >>> >> and feature-dependencies. Will that part change? For example, will > the > >>> tool > >>> >> show version 0 features with --release-version 3.8 or do we exclude > >>> them. > >>> >> > >>> >> bin/kafka-storage.sh version-mapping --release-version 3.6-IV1 > >>> >> metadata.version=13 (3.6-IV1) transaction.version=0 > >>> group.version=0 > >>> >> kraft.version=0 > >>> >> > >>> >> Jun > >>> >> > >>> >> On Sat, Jun 22, 2024 at 2:19 PM José Armando García Sancio > >>> >> <jsan...@confluent.io.invalid> wrote: > >>> >> > >>> >> > Thanks for the update Colin. The changes make sense to me. > >>> >> > > >>> >> > Are you planning to update the KIP to reflect this new RPC > version? > >>> It > >>> >> > would be good to document the semantics explained above in the > KIP. > >>> >> > > >>> >> > Thanks! > >>> >> > > >>> >> > On Fri, Jun 21, 2024 at 8:22 PM Justine Olshan > >>> >> > <jols...@confluent.io.invalid> wrote: > >>> >> > > > >>> >> > > Ok makes sense. I will update my PR. > >>> >> > > > >>> >> > > On Fri, Jun 21, 2024 at 5:09 PM Colin McCabe < > cmcc...@apache.org> > >>> >> wrote: > >>> >> > > > >>> >> > > > I think it's better to suppress the response in v3. The issue > >>> with > >>> >> > > > modifying it is that there may be scenarios where [1, 1] is > the > >>> >> actual > >>> >> > > > supported range, and we'd want to know that. But leaving out > the > >>> >> > feature > >>> >> > > > should be OK for older clients (it will be the case with > clients > >>> old > >>> >> > enough > >>> >> > > > to send a v0, v1, or v2 ApiVersionsRequest anyway) > >>> >> > > > > >>> >> > > > best, > >>> >> > > > Colin > >>> >> > > > > >>> >> > > > On Fri, Jun 21, 2024, at 16:46, Justine Olshan wrote: > >>> >> > > > > Thanks Colin, > >>> >> > > > > > >>> >> > > > > This makes sense to me. Namely in the case where we perhaps > >>> don't > >>> >> > want to > >>> >> > > > > support version 0 anymore, we need the range to be able to > not > >>> >> > include 0. > >>> >> > > > > (In other words, we can't assume 0 is supported) > >>> >> > > > > It is unfortunate that this change is a bit tricky, but I > think > >>> >> it's > >>> >> > the > >>> >> > > > > best option. > >>> >> > > > > > >>> >> > > > > Can you clarify > >>> >> > > > >> The server will simply leave out the features whose minimum > >>> >> > supported > >>> >> > > > > value is 0 for clients that send v3 > >>> >> > > > > > >>> >> > > > > For 3.8, I planned to set the 0s in the response to 1. Is it > >>> better > >>> >> > to > >>> >> > > > > suppress the zero version features in the response so we are > >>> >> > consistent > >>> >> > > > > between trunk and 3.8? > >>> >> > > > > > >>> >> > > > > Thanks, > >>> >> > > > > Justine > >>> >> > > > > > >>> >> > > > > On Fri, Jun 21, 2024 at 4:34 PM Colin McCabe < > >>> cmcc...@apache.org> > >>> >> > wrote: > >>> >> > > > > > >>> >> > > > >> Hi all, > >>> >> > > > >> > >>> >> > > > >> It seems that there was a bug in older versions of Kafka > which > >>> >> > caused > >>> >> > > > >> deserialization problems when a supported feature range > >>> included > >>> >> 0. > >>> >> > For > >>> >> > > > >> example, the range for group.version of [0, 1] would be a > >>> problem > >>> >> in > >>> >> > > > this > >>> >> > > > >> situation. > >>> >> > > > >> > >>> >> > > > >> This obviously makes supportedVersions kind of useless. Any > >>> >> feature > >>> >> > that > >>> >> > > > >> doesn't exist today is effectively at v0 today (v0 is > >>> equivalent > >>> >> to > >>> >> > > > "off"). > >>> >> > > > >> But if we can't declare that the server supports [0, 1] or > >>> >> similar, > >>> >> > we > >>> >> > > > >> can't declare that it supports the feature being off. > >>> Therefore, > >>> >> no > >>> >> > > > rolling > >>> >> > > > >> upgrades are possible. > >>> >> > > > >> > >>> >> > > > >> We noticed this bug during the 3.8 release when we noticed > >>> >> problems > >>> >> > in > >>> >> > > > >> upgrade tests. As an addendum to KIP-1022, we're adding the > >>> >> > following > >>> >> > > > >> solution: > >>> >> > > > >> > >>> >> > > > >> - There will be a new v4 for ApiVersionsRequest > >>> >> > > > >> > >>> >> > > > >> - Clients that sent v4 will promise to correctly handle > ranges > >>> >> that > >>> >> > > > start > >>> >> > > > >> with 0, such as [0, 1] > >>> >> > > > >> > >>> >> > > > >> - The server will simply leave out the features whose > minimum > >>> >> > supported > >>> >> > > > >> value is 0 for clients that send v3 > >>> >> > > > >> > >>> >> > > > >> - ApiVersionsRequest v4 will be supported in AK 3.9 and > >>> above. AK > >>> >> > 3.8 > >>> >> > > > will > >>> >> > > > >> ship with ApiVersionsRequest v3 just as today. > >>> >> > > > >> > >>> >> > > > >> thanks, > >>> >> > > > >> Colin > >>> >> > > > >> > >>> >> > > > >> > >>> >> > > > >> On Mon, Apr 15, 2024, at 11:01, Justine Olshan wrote: > >>> >> > > > >> > Hey folks, > >>> >> > > > >> > > >>> >> > > > >> > Thanks everyone! I will go ahead and call it. > >>> >> > > > >> > The KIP passes with the following +1 votes: > >>> >> > > > >> > > >>> >> > > > >> > - Andrew Schofield (non-binding) > >>> >> > > > >> > - David Jacot (binding) > >>> >> > > > >> > - José Armando García Sancio (binding) > >>> >> > > > >> > - Jun Rao (binding) > >>> >> > > > >> > > >>> >> > > > >> > Thanks again, > >>> >> > > > >> > Justine > >>> >> > > > >> > > >>> >> > > > >> > On Fri, Apr 12, 2024 at 11:16 AM Jun Rao > >>> >> <j...@confluent.io.invalid > >>> >> > > > >>> >> > > > >> wrote: > >>> >> > > > >> > > >>> >> > > > >> >> Hi, Justine, > >>> >> > > > >> >> > >>> >> > > > >> >> Thanks for the KIP. +1 > >>> >> > > > >> >> > >>> >> > > > >> >> Jun > >>> >> > > > >> >> > >>> >> > > > >> >> On Wed, Apr 10, 2024 at 9:13 AM José Armando García > Sancio > >>> >> > > > >> >> <jsan...@confluent.io.invalid> wrote: > >>> >> > > > >> >> > >>> >> > > > >> >> > Hi Justine, > >>> >> > > > >> >> > > >>> >> > > > >> >> > +1 (binding) > >>> >> > > > >> >> > > >>> >> > > > >> >> > Thanks for the improvement. > >>> >> > > > >> >> > -- > >>> >> > > > >> >> > -José > >>> >> > > > >> >> > > >>> >> > > > >> >> > >>> >> > > > >> > >>> >> > > > > >>> >> > > >>> >> > > >>> >> > > >>> >> > -- > >>> >> > -José > >>> >> > > >>> >> > >>> > >> >