Hi Jun, Colin, Have we considered sticking with the range going from version 1 to N where version 1 would be the equivalent of "disabled"? In the group.version case, we could introduce group.version=1 that does basically nothing and group.version=2 that enables the new protocol. I suppose that we could do the same for the other features. I agree that it is less elegant but it would avoid all the backward compatibility issues.
Best, David On Fri, Jun 28, 2024 at 6:02 PM Jun Rao <j...@confluent.io.invalid> wrote: > Hi, Colin, > > Yes, #3 is the scenario that I was thinking about. > > In either approach, there will be some information missing in the old > client. It seems that we should just pick the one that's less wrong. In the > more common case when a feature is finalized on the server, presenting a > supported feature with a range of 1-1 seems less wrong than omitting it in > the output of "kafka-features describe". > > Thanks, > > Jun > > On Thu, Jun 27, 2024 at 9:52 PM Colin McCabe <cmcc...@apache.org> wrote: > > > Hi Jun, > > > > This is a fair question. I think there's a few different scenarios to > > consider: > > > > 1. mixed server software versions in a single cluster > > > > 2. new client software + old server software > > > > 3. old client software + new server software > > > > In scenario #1 and #2, we have old (pre-3.9) server software in the mix. > > This old software won't support features like group.version and > > kraft.version. As we know, there are no features supported in 3.8 and > older > > except metadata.version itself. So the fact that we leave out some stuff > > from the ApiVersionResponse isn't terribly significant. We weren't going > to > > be able to enable those post-3.8 features anyway, since enabling a > feature > > requires ALL server nodes to support it. > > > > Scenario #3 is more interesting. With new server software, features like > > group.version and kraft.version may be enabled. But due to the > KAFKA-17011 > > bug, we cannot accurately communicate the supported feature range back to > > the old client. > > > > What is the impact of this? It depends on what the client is. Today, the > > only client that cares about feature versions is admin client, which can > > surface them through the Admin.describeFeatures API. So if we omit the > > supported feature range, admi client won't report it. If we fudge it by > > reporting it as 1-1 instead of 0-1, admin client will report the fudged > > version. > > > > In theory, there could be other clients looking at the supported feature > > ranges later, but I guess those will be post-3.8, if they ever exist, and > > so not subject to this problem. > > > > AdminClient returns a separate map for "supported features" and > "finalized > > features." So leaving out the supported versions for group.version and > > kraft.version will not prevent the client from returning the finalized > > versions of those features to the old client. > > > > So basically we have a choice between missing information in > > Admin.describeFeatures and wrong information. I would lean towards the > > missing information path, but I guess we should try out an old build of > > kafka-features.sh against a server with one of the new features enabled, > to > > make sure it looks the way we want. > > > > best, > > Colin > > > > > > On Thu, Jun 27, 2024, at 14:01, Jun Rao wrote: > > > Hi, Colin, > > > > > > ApiVersionResponse includes both supported and finalized features. If > we > > > only suppress features in the supported field, but not in the finalized > > > field, it can potentially lead to inconsistency in the older client. > For > > > example, if a future feature supporting V0 is finalized in the broker, > an > > > old client issuing V3 of ApiVersionRequest will see the feature in the > > > finalized field, but not in the supported field. > > > > > > An alternative approach is to still include all features in the > supported > > > field, but replace minVersion of 0 with 1. This may still lead to > > > inconsistency if a future feature is finalized at version 0. However, > > since > > > downgrading is less frequent than upgrading, this approach seems > slightly > > > more consistent. > > > > > > No matter what approach we take, it would be useful to document this > > > inconsistency to the old client. > > > > > > Thanks, > > > > > > Jun > > > > > > On Wed, Jun 26, 2024 at 1:18 PM Jun Rao <j...@confluent.io> wrote: > > > > > >> Thanks for the reply, Justine and Colin. Sounds good to me. > > >> > > >> Jun > > >> > > >> On Wed, Jun 26, 2024 at 12:54 PM Colin McCabe <cmcc...@apache.org> > > wrote: > > >> > > >>> Hi Justine, > > >>> > > >>> Yes, that was what I was thinking. > > >>> > > >>> best, > > >>> Colin > > >>> > > >>> On Mon, Jun 24, 2024, at 11:11, Justine Olshan wrote: > > >>> > My understanding is that the tools that don't rely on ApiVersions > > should > > >>> > still return 0s when it is the correct value. I believe these > > commands > > >>> do > > >>> > not require this API and thus can show 0 as versions. > > >>> > > > >>> > Likewise, when the old ApiVersionsRequest is used to describe > > features, > > >>> we > > >>> > can't return 0 versions and we won't be able to see group version > > set. > > >>> > However, the new api will return 0 and the group version correctly. > > >>> > > > >>> > Let me know if this is consistent with your thoughts, Colin. > > >>> > > > >>> > Justine > > >>> > > > >>> > On Mon, Jun 24, 2024 at 10:44 AM Jun Rao <j...@confluent.io.invalid > > > > >>> wrote: > > >>> > > > >>> >> Hi, Colin, > > >>> >> > > >>> >> Thanks for the update. The proposed change seems reasonable to me. > > >>> Just one > > >>> >> clarification. > > >>> >> > > >>> >> The KIP can show version 0 of certain features with > version-mapping > > >>> >> and feature-dependencies. Will that part change? For example, will > > the > > >>> tool > > >>> >> show version 0 features with --release-version 3.8 or do we > exclude > > >>> them. > > >>> >> > > >>> >> bin/kafka-storage.sh version-mapping --release-version 3.6-IV1 > > >>> >> metadata.version=13 (3.6-IV1) transaction.version=0 > > >>> group.version=0 > > >>> >> kraft.version=0 > > >>> >> > > >>> >> Jun > > >>> >> > > >>> >> On Sat, Jun 22, 2024 at 2:19 PM José Armando García Sancio > > >>> >> <jsan...@confluent.io.invalid> wrote: > > >>> >> > > >>> >> > Thanks for the update Colin. The changes make sense to me. > > >>> >> > > > >>> >> > Are you planning to update the KIP to reflect this new RPC > > version? > > >>> It > > >>> >> > would be good to document the semantics explained above in the > > KIP. > > >>> >> > > > >>> >> > Thanks! > > >>> >> > > > >>> >> > On Fri, Jun 21, 2024 at 8:22 PM Justine Olshan > > >>> >> > <jols...@confluent.io.invalid> wrote: > > >>> >> > > > > >>> >> > > Ok makes sense. I will update my PR. > > >>> >> > > > > >>> >> > > On Fri, Jun 21, 2024 at 5:09 PM Colin McCabe < > > cmcc...@apache.org> > > >>> >> wrote: > > >>> >> > > > > >>> >> > > > I think it's better to suppress the response in v3. The > issue > > >>> with > > >>> >> > > > modifying it is that there may be scenarios where [1, 1] is > > the > > >>> >> actual > > >>> >> > > > supported range, and we'd want to know that. But leaving out > > the > > >>> >> > feature > > >>> >> > > > should be OK for older clients (it will be the case with > > clients > > >>> old > > >>> >> > enough > > >>> >> > > > to send a v0, v1, or v2 ApiVersionsRequest anyway) > > >>> >> > > > > > >>> >> > > > best, > > >>> >> > > > Colin > > >>> >> > > > > > >>> >> > > > On Fri, Jun 21, 2024, at 16:46, Justine Olshan wrote: > > >>> >> > > > > Thanks Colin, > > >>> >> > > > > > > >>> >> > > > > This makes sense to me. Namely in the case where we > perhaps > > >>> don't > > >>> >> > want to > > >>> >> > > > > support version 0 anymore, we need the range to be able to > > not > > >>> >> > include 0. > > >>> >> > > > > (In other words, we can't assume 0 is supported) > > >>> >> > > > > It is unfortunate that this change is a bit tricky, but I > > think > > >>> >> it's > > >>> >> > the > > >>> >> > > > > best option. > > >>> >> > > > > > > >>> >> > > > > Can you clarify > > >>> >> > > > >> The server will simply leave out the features whose > minimum > > >>> >> > supported > > >>> >> > > > > value is 0 for clients that send v3 > > >>> >> > > > > > > >>> >> > > > > For 3.8, I planned to set the 0s in the response to 1. Is > it > > >>> better > > >>> >> > to > > >>> >> > > > > suppress the zero version features in the response so we > are > > >>> >> > consistent > > >>> >> > > > > between trunk and 3.8? > > >>> >> > > > > > > >>> >> > > > > Thanks, > > >>> >> > > > > Justine > > >>> >> > > > > > > >>> >> > > > > On Fri, Jun 21, 2024 at 4:34 PM Colin McCabe < > > >>> cmcc...@apache.org> > > >>> >> > wrote: > > >>> >> > > > > > > >>> >> > > > >> Hi all, > > >>> >> > > > >> > > >>> >> > > > >> It seems that there was a bug in older versions of Kafka > > which > > >>> >> > caused > > >>> >> > > > >> deserialization problems when a supported feature range > > >>> included > > >>> >> 0. > > >>> >> > For > > >>> >> > > > >> example, the range for group.version of [0, 1] would be a > > >>> problem > > >>> >> in > > >>> >> > > > this > > >>> >> > > > >> situation. > > >>> >> > > > >> > > >>> >> > > > >> This obviously makes supportedVersions kind of useless. > Any > > >>> >> feature > > >>> >> > that > > >>> >> > > > >> doesn't exist today is effectively at v0 today (v0 is > > >>> equivalent > > >>> >> to > > >>> >> > > > "off"). > > >>> >> > > > >> But if we can't declare that the server supports [0, 1] > or > > >>> >> similar, > > >>> >> > we > > >>> >> > > > >> can't declare that it supports the feature being off. > > >>> Therefore, > > >>> >> no > > >>> >> > > > rolling > > >>> >> > > > >> upgrades are possible. > > >>> >> > > > >> > > >>> >> > > > >> We noticed this bug during the 3.8 release when we > noticed > > >>> >> problems > > >>> >> > in > > >>> >> > > > >> upgrade tests. As an addendum to KIP-1022, we're adding > the > > >>> >> > following > > >>> >> > > > >> solution: > > >>> >> > > > >> > > >>> >> > > > >> - There will be a new v4 for ApiVersionsRequest > > >>> >> > > > >> > > >>> >> > > > >> - Clients that sent v4 will promise to correctly handle > > ranges > > >>> >> that > > >>> >> > > > start > > >>> >> > > > >> with 0, such as [0, 1] > > >>> >> > > > >> > > >>> >> > > > >> - The server will simply leave out the features whose > > minimum > > >>> >> > supported > > >>> >> > > > >> value is 0 for clients that send v3 > > >>> >> > > > >> > > >>> >> > > > >> - ApiVersionsRequest v4 will be supported in AK 3.9 and > > >>> above. AK > > >>> >> > 3.8 > > >>> >> > > > will > > >>> >> > > > >> ship with ApiVersionsRequest v3 just as today. > > >>> >> > > > >> > > >>> >> > > > >> thanks, > > >>> >> > > > >> Colin > > >>> >> > > > >> > > >>> >> > > > >> > > >>> >> > > > >> On Mon, Apr 15, 2024, at 11:01, Justine Olshan wrote: > > >>> >> > > > >> > Hey folks, > > >>> >> > > > >> > > > >>> >> > > > >> > Thanks everyone! I will go ahead and call it. > > >>> >> > > > >> > The KIP passes with the following +1 votes: > > >>> >> > > > >> > > > >>> >> > > > >> > - Andrew Schofield (non-binding) > > >>> >> > > > >> > - David Jacot (binding) > > >>> >> > > > >> > - José Armando García Sancio (binding) > > >>> >> > > > >> > - Jun Rao (binding) > > >>> >> > > > >> > > > >>> >> > > > >> > Thanks again, > > >>> >> > > > >> > Justine > > >>> >> > > > >> > > > >>> >> > > > >> > On Fri, Apr 12, 2024 at 11:16 AM Jun Rao > > >>> >> <j...@confluent.io.invalid > > >>> >> > > > > >>> >> > > > >> wrote: > > >>> >> > > > >> > > > >>> >> > > > >> >> Hi, Justine, > > >>> >> > > > >> >> > > >>> >> > > > >> >> Thanks for the KIP. +1 > > >>> >> > > > >> >> > > >>> >> > > > >> >> Jun > > >>> >> > > > >> >> > > >>> >> > > > >> >> On Wed, Apr 10, 2024 at 9:13 AM José Armando García > > Sancio > > >>> >> > > > >> >> <jsan...@confluent.io.invalid> wrote: > > >>> >> > > > >> >> > > >>> >> > > > >> >> > Hi Justine, > > >>> >> > > > >> >> > > > >>> >> > > > >> >> > +1 (binding) > > >>> >> > > > >> >> > > > >>> >> > > > >> >> > Thanks for the improvement. > > >>> >> > > > >> >> > -- > > >>> >> > > > >> >> > -José > > >>> >> > > > >> >> > > > >>> >> > > > >> >> > > >>> >> > > > >> > > >>> >> > > > > > >>> >> > > > >>> >> > > > >>> >> > > > >>> >> > -- > > >>> >> > -José > > >>> >> > > > >>> >> > > >>> > > >> > > >