Hi David, In the ApiVersionsResponse, we really don't have an easy way of mapping finalizedVersion = 1 to "off" in older releases such as 3.7.0. For example, if a 3.9.0 broker advertises that it has finalized group.version = 1, that will be treated by 3.7.0 as a brand new feature, not as "KIP-848 is off." However, I suppose we could work around this by not setting a finalizedVersion at all for group.version (or any other feature) if its finalized level was 1. We could also work around the "deletion = set to 0" issue on the server side. The server can translate requests to set the finalized level to 0, into requests to set it to 1.
So maybe this solution is worth considering, although it's unfortunate to lose 0. I suppose we'd have to special case metadata.version being set to 1, since that was NOT equivalent to it being "off" best, Colin On Mon, Jul 1, 2024, at 10:11, Jun Rao wrote: > Hi, David, > > Yes, that's another option. It probably has its own challenges. For > example, the FeatureCommand tool currently treats disabling a feature as > setting the version to 0. It would be useful to get Jose's opinion on this > since he introduced version 0 in the kraft.version feature. > > Thanks, > > Jun > > On Sun, Jun 30, 2024 at 11:48 PM David Jacot <dja...@confluent.io.invalid> > wrote: > >> Hi Jun, Colin, >> >> Have we considered sticking with the range going from version 1 to N where >> version 1 would be the equivalent of "disabled"? In the group.version case, >> we could introduce group.version=1 that does basically nothing and >> group.version=2 that enables the new protocol. I suppose that we could do >> the same for the other features. I agree that it is less elegant but it >> would avoid all the backward compatibility issues. >> >> Best, >> David >> >> On Fri, Jun 28, 2024 at 6:02 PM Jun Rao <j...@confluent.io.invalid> wrote: >> >> > Hi, Colin, >> > >> > Yes, #3 is the scenario that I was thinking about. >> > >> > In either approach, there will be some information missing in the old >> > client. It seems that we should just pick the one that's less wrong. In >> the >> > more common case when a feature is finalized on the server, presenting a >> > supported feature with a range of 1-1 seems less wrong than omitting it >> in >> > the output of "kafka-features describe". >> > >> > Thanks, >> > >> > Jun >> > >> > On Thu, Jun 27, 2024 at 9:52 PM Colin McCabe <cmcc...@apache.org> wrote: >> > >> > > Hi Jun, >> > > >> > > This is a fair question. I think there's a few different scenarios to >> > > consider: >> > > >> > > 1. mixed server software versions in a single cluster >> > > >> > > 2. new client software + old server software >> > > >> > > 3. old client software + new server software >> > > >> > > In scenario #1 and #2, we have old (pre-3.9) server software in the >> mix. >> > > This old software won't support features like group.version and >> > > kraft.version. As we know, there are no features supported in 3.8 and >> > older >> > > except metadata.version itself. So the fact that we leave out some >> stuff >> > > from the ApiVersionResponse isn't terribly significant. We weren't >> going >> > to >> > > be able to enable those post-3.8 features anyway, since enabling a >> > feature >> > > requires ALL server nodes to support it. >> > > >> > > Scenario #3 is more interesting. With new server software, features >> like >> > > group.version and kraft.version may be enabled. But due to the >> > KAFKA-17011 >> > > bug, we cannot accurately communicate the supported feature range back >> to >> > > the old client. >> > > >> > > What is the impact of this? It depends on what the client is. Today, >> the >> > > only client that cares about feature versions is admin client, which >> can >> > > surface them through the Admin.describeFeatures API. So if we omit the >> > > supported feature range, admi client won't report it. If we fudge it by >> > > reporting it as 1-1 instead of 0-1, admin client will report the fudged >> > > version. >> > > >> > > In theory, there could be other clients looking at the supported >> feature >> > > ranges later, but I guess those will be post-3.8, if they ever exist, >> and >> > > so not subject to this problem. >> > > >> > > AdminClient returns a separate map for "supported features" and >> > "finalized >> > > features." So leaving out the supported versions for group.version and >> > > kraft.version will not prevent the client from returning the finalized >> > > versions of those features to the old client. >> > > >> > > So basically we have a choice between missing information in >> > > Admin.describeFeatures and wrong information. I would lean towards the >> > > missing information path, but I guess we should try out an old build of >> > > kafka-features.sh against a server with one of the new features >> enabled, >> > to >> > > make sure it looks the way we want. >> > > >> > > best, >> > > Colin >> > > >> > > >> > > On Thu, Jun 27, 2024, at 14:01, Jun Rao wrote: >> > > > Hi, Colin, >> > > > >> > > > ApiVersionResponse includes both supported and finalized features. If >> > we >> > > > only suppress features in the supported field, but not in the >> finalized >> > > > field, it can potentially lead to inconsistency in the older client. >> > For >> > > > example, if a future feature supporting V0 is finalized in the >> broker, >> > an >> > > > old client issuing V3 of ApiVersionRequest will see the feature in >> the >> > > > finalized field, but not in the supported field. >> > > > >> > > > An alternative approach is to still include all features in the >> > supported >> > > > field, but replace minVersion of 0 with 1. This may still lead to >> > > > inconsistency if a future feature is finalized at version 0. However, >> > > since >> > > > downgrading is less frequent than upgrading, this approach seems >> > slightly >> > > > more consistent. >> > > > >> > > > No matter what approach we take, it would be useful to document this >> > > > inconsistency to the old client. >> > > > >> > > > Thanks, >> > > > >> > > > Jun >> > > > >> > > > On Wed, Jun 26, 2024 at 1:18 PM Jun Rao <j...@confluent.io> wrote: >> > > > >> > > >> Thanks for the reply, Justine and Colin. Sounds good to me. >> > > >> >> > > >> Jun >> > > >> >> > > >> On Wed, Jun 26, 2024 at 12:54 PM Colin McCabe <cmcc...@apache.org> >> > > wrote: >> > > >> >> > > >>> Hi Justine, >> > > >>> >> > > >>> Yes, that was what I was thinking. >> > > >>> >> > > >>> best, >> > > >>> Colin >> > > >>> >> > > >>> On Mon, Jun 24, 2024, at 11:11, Justine Olshan wrote: >> > > >>> > My understanding is that the tools that don't rely on ApiVersions >> > > should >> > > >>> > still return 0s when it is the correct value. I believe these >> > > commands >> > > >>> do >> > > >>> > not require this API and thus can show 0 as versions. >> > > >>> > >> > > >>> > Likewise, when the old ApiVersionsRequest is used to describe >> > > features, >> > > >>> we >> > > >>> > can't return 0 versions and we won't be able to see group version >> > > set. >> > > >>> > However, the new api will return 0 and the group version >> correctly. >> > > >>> > >> > > >>> > Let me know if this is consistent with your thoughts, Colin. >> > > >>> > >> > > >>> > Justine >> > > >>> > >> > > >>> > On Mon, Jun 24, 2024 at 10:44 AM Jun Rao >> <j...@confluent.io.invalid >> > > >> > > >>> wrote: >> > > >>> > >> > > >>> >> Hi, Colin, >> > > >>> >> >> > > >>> >> Thanks for the update. The proposed change seems reasonable to >> me. >> > > >>> Just one >> > > >>> >> clarification. >> > > >>> >> >> > > >>> >> The KIP can show version 0 of certain features with >> > version-mapping >> > > >>> >> and feature-dependencies. Will that part change? For example, >> will >> > > the >> > > >>> tool >> > > >>> >> show version 0 features with --release-version 3.8 or do we >> > exclude >> > > >>> them. >> > > >>> >> >> > > >>> >> bin/kafka-storage.sh version-mapping --release-version 3.6-IV1 >> > > >>> >> metadata.version=13 (3.6-IV1) transaction.version=0 >> > > >>> group.version=0 >> > > >>> >> kraft.version=0 >> > > >>> >> >> > > >>> >> Jun >> > > >>> >> >> > > >>> >> On Sat, Jun 22, 2024 at 2:19 PM José Armando García Sancio >> > > >>> >> <jsan...@confluent.io.invalid> wrote: >> > > >>> >> >> > > >>> >> > Thanks for the update Colin. The changes make sense to me. >> > > >>> >> > >> > > >>> >> > Are you planning to update the KIP to reflect this new RPC >> > > version? >> > > >>> It >> > > >>> >> > would be good to document the semantics explained above in the >> > > KIP. >> > > >>> >> > >> > > >>> >> > Thanks! >> > > >>> >> > >> > > >>> >> > On Fri, Jun 21, 2024 at 8:22 PM Justine Olshan >> > > >>> >> > <jols...@confluent.io.invalid> wrote: >> > > >>> >> > > >> > > >>> >> > > Ok makes sense. I will update my PR. >> > > >>> >> > > >> > > >>> >> > > On Fri, Jun 21, 2024 at 5:09 PM Colin McCabe < >> > > cmcc...@apache.org> >> > > >>> >> wrote: >> > > >>> >> > > >> > > >>> >> > > > I think it's better to suppress the response in v3. The >> > issue >> > > >>> with >> > > >>> >> > > > modifying it is that there may be scenarios where [1, 1] >> is >> > > the >> > > >>> >> actual >> > > >>> >> > > > supported range, and we'd want to know that. But leaving >> out >> > > the >> > > >>> >> > feature >> > > >>> >> > > > should be OK for older clients (it will be the case with >> > > clients >> > > >>> old >> > > >>> >> > enough >> > > >>> >> > > > to send a v0, v1, or v2 ApiVersionsRequest anyway) >> > > >>> >> > > > >> > > >>> >> > > > best, >> > > >>> >> > > > Colin >> > > >>> >> > > > >> > > >>> >> > > > On Fri, Jun 21, 2024, at 16:46, Justine Olshan wrote: >> > > >>> >> > > > > Thanks Colin, >> > > >>> >> > > > > >> > > >>> >> > > > > This makes sense to me. Namely in the case where we >> > perhaps >> > > >>> don't >> > > >>> >> > want to >> > > >>> >> > > > > support version 0 anymore, we need the range to be able >> to >> > > not >> > > >>> >> > include 0. >> > > >>> >> > > > > (In other words, we can't assume 0 is supported) >> > > >>> >> > > > > It is unfortunate that this change is a bit tricky, but >> I >> > > think >> > > >>> >> it's >> > > >>> >> > the >> > > >>> >> > > > > best option. >> > > >>> >> > > > > >> > > >>> >> > > > > Can you clarify >> > > >>> >> > > > >> The server will simply leave out the features whose >> > minimum >> > > >>> >> > supported >> > > >>> >> > > > > value is 0 for clients that send v3 >> > > >>> >> > > > > >> > > >>> >> > > > > For 3.8, I planned to set the 0s in the response to 1. >> Is >> > it >> > > >>> better >> > > >>> >> > to >> > > >>> >> > > > > suppress the zero version features in the response so we >> > are >> > > >>> >> > consistent >> > > >>> >> > > > > between trunk and 3.8? >> > > >>> >> > > > > >> > > >>> >> > > > > Thanks, >> > > >>> >> > > > > Justine >> > > >>> >> > > > > >> > > >>> >> > > > > On Fri, Jun 21, 2024 at 4:34 PM Colin McCabe < >> > > >>> cmcc...@apache.org> >> > > >>> >> > wrote: >> > > >>> >> > > > > >> > > >>> >> > > > >> Hi all, >> > > >>> >> > > > >> >> > > >>> >> > > > >> It seems that there was a bug in older versions of >> Kafka >> > > which >> > > >>> >> > caused >> > > >>> >> > > > >> deserialization problems when a supported feature range >> > > >>> included >> > > >>> >> 0. >> > > >>> >> > For >> > > >>> >> > > > >> example, the range for group.version of [0, 1] would >> be a >> > > >>> problem >> > > >>> >> in >> > > >>> >> > > > this >> > > >>> >> > > > >> situation. >> > > >>> >> > > > >> >> > > >>> >> > > > >> This obviously makes supportedVersions kind of useless. >> > Any >> > > >>> >> feature >> > > >>> >> > that >> > > >>> >> > > > >> doesn't exist today is effectively at v0 today (v0 is >> > > >>> equivalent >> > > >>> >> to >> > > >>> >> > > > "off"). >> > > >>> >> > > > >> But if we can't declare that the server supports [0, 1] >> > or >> > > >>> >> similar, >> > > >>> >> > we >> > > >>> >> > > > >> can't declare that it supports the feature being off. >> > > >>> Therefore, >> > > >>> >> no >> > > >>> >> > > > rolling >> > > >>> >> > > > >> upgrades are possible. >> > > >>> >> > > > >> >> > > >>> >> > > > >> We noticed this bug during the 3.8 release when we >> > noticed >> > > >>> >> problems >> > > >>> >> > in >> > > >>> >> > > > >> upgrade tests. As an addendum to KIP-1022, we're adding >> > the >> > > >>> >> > following >> > > >>> >> > > > >> solution: >> > > >>> >> > > > >> >> > > >>> >> > > > >> - There will be a new v4 for ApiVersionsRequest >> > > >>> >> > > > >> >> > > >>> >> > > > >> - Clients that sent v4 will promise to correctly handle >> > > ranges >> > > >>> >> that >> > > >>> >> > > > start >> > > >>> >> > > > >> with 0, such as [0, 1] >> > > >>> >> > > > >> >> > > >>> >> > > > >> - The server will simply leave out the features whose >> > > minimum >> > > >>> >> > supported >> > > >>> >> > > > >> value is 0 for clients that send v3 >> > > >>> >> > > > >> >> > > >>> >> > > > >> - ApiVersionsRequest v4 will be supported in AK 3.9 and >> > > >>> above. AK >> > > >>> >> > 3.8 >> > > >>> >> > > > will >> > > >>> >> > > > >> ship with ApiVersionsRequest v3 just as today. >> > > >>> >> > > > >> >> > > >>> >> > > > >> thanks, >> > > >>> >> > > > >> Colin >> > > >>> >> > > > >> >> > > >>> >> > > > >> >> > > >>> >> > > > >> On Mon, Apr 15, 2024, at 11:01, Justine Olshan wrote: >> > > >>> >> > > > >> > Hey folks, >> > > >>> >> > > > >> > >> > > >>> >> > > > >> > Thanks everyone! I will go ahead and call it. >> > > >>> >> > > > >> > The KIP passes with the following +1 votes: >> > > >>> >> > > > >> > >> > > >>> >> > > > >> > - Andrew Schofield (non-binding) >> > > >>> >> > > > >> > - David Jacot (binding) >> > > >>> >> > > > >> > - José Armando García Sancio (binding) >> > > >>> >> > > > >> > - Jun Rao (binding) >> > > >>> >> > > > >> > >> > > >>> >> > > > >> > Thanks again, >> > > >>> >> > > > >> > Justine >> > > >>> >> > > > >> > >> > > >>> >> > > > >> > On Fri, Apr 12, 2024 at 11:16 AM Jun Rao >> > > >>> >> <j...@confluent.io.invalid >> > > >>> >> > > >> > > >>> >> > > > >> wrote: >> > > >>> >> > > > >> > >> > > >>> >> > > > >> >> Hi, Justine, >> > > >>> >> > > > >> >> >> > > >>> >> > > > >> >> Thanks for the KIP. +1 >> > > >>> >> > > > >> >> >> > > >>> >> > > > >> >> Jun >> > > >>> >> > > > >> >> >> > > >>> >> > > > >> >> On Wed, Apr 10, 2024 at 9:13 AM José Armando García >> > > Sancio >> > > >>> >> > > > >> >> <jsan...@confluent.io.invalid> wrote: >> > > >>> >> > > > >> >> >> > > >>> >> > > > >> >> > Hi Justine, >> > > >>> >> > > > >> >> > >> > > >>> >> > > > >> >> > +1 (binding) >> > > >>> >> > > > >> >> > >> > > >>> >> > > > >> >> > Thanks for the improvement. >> > > >>> >> > > > >> >> > -- >> > > >>> >> > > > >> >> > -José >> > > >>> >> > > > >> >> > >> > > >>> >> > > > >> >> >> > > >>> >> > > > >> >> > > >>> >> > > > >> > > >>> >> > >> > > >>> >> > >> > > >>> >> > >> > > >>> >> > -- >> > > >>> >> > -José >> > > >>> >> > >> > > >>> >> >> > > >>> >> > > >> >> > > >> > >>