I also agree with 2 & 2 with reasoning along the same lines as Artem. Thanks, Andrew
> On 12 Jan 2024, at 09:15, Federico Valeri <fedeval...@gmail.com> wrote:
>
> On Thu, Jan 11, 2024 at 10:43 PM Artem Livshits
> <alivsh...@confluent.io.invalid> wrote:
>>
>> Hi Proven,
>>
>> I'd say that we should do 2 & 2. The idea is that for small features that
>> can be done and stabilized within a short period of time (with one or very
>> few commits) that's exactly what happens -- people interested in testing an
>> in-progress feature could take unstable code from a patch (or private
>> branch / fork) with the expectation that that private code could create a
>> state that will not be compatible with anything (or may be completely
>> broken for that matter -- at the end of the day it's functionality that
>> may not be fully tested or even fully implemented); and once the feature is
>> stable it goes to trunk and is fully committed there; if bugs are found,
>> they get fixed "forward".
>
> I agree with this reasoning.
>
>> The 2 & 2 option pretty much extends this to
>> large features -- if a feature is above the stable MV, then going above it is
>> like getting some in-progress code for early testing with the expectation
>> that something may not fully work or may leave the system in a
>> non-upgradable state;
>
> Usually I expect that an early access feature may not fully work, but
> not that it could affect upgrades. I think this is less obvious,
> which is why I asked for it to be documented clearly.
>
>> promoting a feature into a stable MV would come with the expectation that
>> the feature gets fully committed and any bugs will be fixed "forward".
>>
>> -Artem
>>
>> On Thu, Jan 11, 2024 at 10:16 AM Proven Provenzano
>> <pprovenz...@confluent.io.invalid> wrote:
>>
>>> We have two approaches here for how we update unstable metadata versions.
>>>
>>>    1. The update will only increase MVs of unstable features to a value
>>>       greater than the new stable feature.
>>>       The idea is that a specific unstable MV may support some set of
>>>       features, and in the future that set is always a strict subset of
>>>       the current set. The issue is that moving a feature to make way
>>>       for a stable feature with a higher MV will leave holes.
>>>    2. We are free to reorder the MV for any unstable feature. This removes
>>>       the hole issue, but does make the unstable MVs more muddled. There
>>>       isn't the same binary state for an MV where a feature is available
>>>       or there is a hole.
>>>
>>>
>>> We also have two ends of the spectrum as to when we update the stable MV.
>>>
>>>    1. We update at release points, which reduces the amount of churn of the
>>>       unstable MVs and makes a stronger correlation between accepted
>>>       features and stable MVs for a release, but means less testing on
>>>       trunk as a stable MV.
>>>    2. We update when the developers of a feature think it is done. This
>>>       leads to features being available for more testing in trunk but
>>>       forces the next release to include it as stable.
>>>
>>>
>>> I'd like more feedback from others on these two dimensions.
>>> --Proven
>>>
>>>
>>>
>>> On Wed, Jan 10, 2024 at 12:16 PM Justine Olshan
>>> <jols...@confluent.io.invalid> wrote:
>>>
>>>> Hmm it seems like Colin and Proven are disagreeing about whether we can
>>>> swap unstable metadata versions.
>>>>
>>>>> When we reorder, we are always allocating a new MV and we are never
>>>>> reusing an existing MV even if it was also unstable.
>>>>
>>>>> Given that this is true, there's no reason to have special rules about
>>>>> what we can and can't do with unstable MVs. We can do anything.
>>>>
>>>> I don't have a strong preference either way, but I think we should agree
>>>> on one approach.
>>>> The benefit of reordering and reusing is that we can release features
>>>> that are ready earlier and we have more flexibility. With the approach
>>>> where we always create a new MV, I am concerned with having many
>>>> "empty" MVs.
>>>> This would encourage waiting until the release before we decide an
>>>> incomplete feature is not ready and moving its MV into the future. (The
>>>> abandoning comment I made earlier -- that is consistent with Proven's
>>>> approach)
>>>>
>>>> I think the only potential issue with reordering is that it could be a
>>>> bit confusing and *potentially* prone to errors. Note I say potentially
>>>> because I think it depends on folks' understanding of this new unstable
>>>> metadata version concept. I echo Federico's comments about making sure
>>>> the risks are highlighted.
>>>>
>>>> Thanks,
>>>>
>>>> Justine
>>>>
>>>> On Wed, Jan 10, 2024 at 1:16 AM Federico Valeri <fedeval...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi folks,
>>>>>
>>>>>> If you use an unstable MV, you probably won't be able to upgrade your
>>>>>> software. Because whenever something changes, you'll probably get
>>>>>> serialization exceptions being thrown inside the controller. Fatal
>>>>>> ones.
>>>>>
>>>>> Thanks for this clarification. I think this concrete risk should be
>>>>> highlighted in the KIP and in the "unstable.metadata.versions.enable"
>>>>> documentation.
>>>>>
>>>>> In the test plan, should we also have one system test checking that
>>>>> "features with a stable MV will never have that MV changed"?
>>>>>
>>>>> On Wed, Jan 10, 2024 at 8:16 AM Colin McCabe <cmcc...@apache.org> wrote:
>>>>>>
>>>>>> On Tue, Jan 9, 2024, at 18:56, Proven Provenzano wrote:
>>>>>>> Hi folks,
>>>>>>>
>>>>>>> Thank you for the questions.
>>>>>>>
>>>>>>> Let me clarify about reorder first. The reorder of unstable metadata
>>>>>>> versions should be infrequent.
>>>>>>
>>>>>> Why does it need to be infrequent? We should be able to reorder
>>>>>> unstable metadata versions as often as we like. There are no
>>>>>> guarantees about unstable MVs.
>>>>>>
>>>>>>> The time you reorder is when a feature that
>>>>>>> requires a higher metadata version to enable becomes "production
>>>>>>> ready" and the features with unstable metadata versions less than the
>>>>>>> new stable one are moved to metadata versions greater than the new
>>>>>>> stable feature. When we reorder, we are always allocating a new MV and
>>>>>>> we are never reusing an existing MV even if it was also unstable. This
>>>>>>> way a developer upgrading their environment with a specific unstable
>>>>>>> MV might see existing functionality stop working but they won't see
>>>>>>> new MV-dependent functionality magically appear. The feature set for a
>>>>>>> given unstable MV can only decrease with reordering.
>>>>>>
>>>>>> If you use an unstable MV, you probably won't be able to upgrade your
>>>>>> software. Because whenever something changes, you'll probably get
>>>>>> serialization exceptions being thrown inside the controller. Fatal
>>>>>> ones.
>>>>>>
>>>>>> Given that this is true, there's no reason to have special rules about
>>>>>> what we can and can't do with unstable MVs. We can do anything.
>>>>>>
>>>>>>>
>>>>>>> How do we define "production ready" and when should we bump
>>>>>>> LATEST_PRODUCTION? I would like to define it to be the point where the
>>>>>>> feature is code complete with tests and the KIP for it is approved.
>>>>>>> However, even with this definition, if the feature later develops a
>>>>>>> major issue it could still block future features until the issue is
>>>>>>> fixed, which is what we are trying to avoid here. We could be much
>>>>>>> more formal about this and let the release manager for a release
>>>>>>> define what is stable for a given release and then do the bump on the
>>>>>>> branch just after the branch is created. When an RC is accepted, the
>>>>>>> bump would be backported.
>>>>>>> I would like to hear other ideas here.
>>>>>>>
>>>>>>
>>>>>> Yeah, it's an interesting question. Overall, I think developers should
>>>>>> define when a feature is production ready.
>>>>>>
>>>>>> The question to ask is, "are you ready to take this feature to
>>>>>> production in your workplace?" I think most developers do have a sense
>>>>>> of this. Obviously bugs and mistakes can happen, but I think this
>>>>>> standard would avoid most of the issues that we're trying to avoid by
>>>>>> having unstable MVs in the first place.
>>>>>>
>>>>>> ELR is a good example. Nobody would have said that it was production
>>>>>> ready in 3.7 ... hence it belonged (and still belongs) in an unstable
>>>>>> MV, until that changes (hopefully soon :) )
>>>>>>
>>>>>> best,
>>>>>> Colin
>>>>>>
>>>>>>> --Proven
>>>>>>>
>>>>>>> On Tue, Jan 9, 2024 at 3:26 PM Colin McCabe <cmcc...@apache.org> wrote:
>>>>>>>
>>>>>>>> Hi Justine,
>>>>>>>>
>>>>>>>> Yes, this is an important point to clarify. Proven can comment more,
>>>>>>>> but my understanding is that we can do anything to unstable metadata
>>>>>>>> versions. Reorder them, delete them, change them in any other way.
>>>>>>>> There are no stability guarantees. If the current text is unclear
>>>>>>>> let's add more examples of what we can do (which is anything) :)
>>>>>>>>
>>>>>>>> best,
>>>>>>>> Colin
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jan 8, 2024, at 14:18, Justine Olshan wrote:
>>>>>>>>> Hey Colin,
>>>>>>>>>
>>>>>>>>> I had some offline discussions with Proven previously and it seems
>>>>>>>>> like he said something different so I'm glad I brought it up here.
>>>>>>>>>
>>>>>>>>> Let's clarify if we are ok with reordering unstable metadata
>>>>>>>>> versions :)
>>>>>>>>>
>>>>>>>>> Justine
>>>>>>>>>
>>>>>>>>> On Mon, Jan 8, 2024 at 1:56 PM Colin McCabe <cmcc...@apache.org> wrote:
>>>>>>>>>
>>>>>>>>>> On Mon, Jan 8, 2024, at 13:19, Justine Olshan wrote:
>>>>>>>>>>> Hey all,
>>>>>>>>>>>
>>>>>>>>>>> I was wondering how often we plan to update the LATEST_PRODUCTION
>>>>>>>>>>> metadata version. Is this something we should do as soon as the
>>>>>>>>>>> feature is complete or something we do when we are releasing
>>>>>>>>>>> Kafka? When is the time we abandon a MV so that other features
>>>>>>>>>>> can be unblocked?
>>>>>>>>>>
>>>>>>>>>> Hi Justine,
>>>>>>>>>>
>>>>>>>>>> Thanks for reviewing.
>>>>>>>>>>
>>>>>>>>>> The idea is that you should bump LATEST_PRODUCTION when you want to
>>>>>>>>>> take a feature to production. That could mean deploying it
>>>>>>>>>> internally somewhere to production, or doing an Apache release that
>>>>>>>>>> lets everyone deploy the thing to production.
>>>>>>>>>>
>>>>>>>>>> Not in production? No need to care about this. Make any changes you
>>>>>>>>>> like.
>>>>>>>>>>
>>>>>>>>>> As a corollary, we should keep the LATEST_PRODUCTION version as low
>>>>>>>>>> as it can be. If you haven't tested the feature, don't freeze it in
>>>>>>>>>> stone yet.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I am just considering a feature that may end up missing a
>>>>>>>>>>> release. It seems like maybe that MV would block future metadata
>>>>>>>>>>> versions until we decide the feature won't make the cut. From
>>>>>>>>>>> that point, all "ready" features should be able to be released.
>>>>>>>>>>
>>>>>>>>>> The intention is the opposite. A feature in an unstable metadata
>>>>>>>>>> version doesn't block anything.
>>>>>>>>>> You can always move a feature from one unstable metadata version
>>>>>>>>>> to another if the feature starts taking too long to finish.
>>>>>>>>>>
>>>>>>>>>>> I'm also wondering if the KIP should include some information
>>>>>>>>>>> about how a metadata version should be abandoned. Maybe there is
>>>>>>>>>>> a specific message to write in the file? So folks who were maybe
>>>>>>>>>>> waiting on that version know they can release their feature?
>>>>>>>>>>>
>>>>>>>>>>> I am also assuming that we don't shift all the waiting metadata
>>>>>>>>>>> versions when we abandon a version, but it would be good to
>>>>>>>>>>> clarify and include in the KIP.
>>>>>>>>>>
>>>>>>>>>> I'm not sure what you mean by abandoning a version. We never
>>>>>>>>>> abandon a version once it's stable.
>>>>>>>>>>
>>>>>>>>>> Unstable versions can change. I wouldn't describe this as
>>>>>>>>>> "abandonment", just the MV changing prior to release.
>>>>>>>>>>
>>>>>>>>>> In a similar way, the contents of the 3.7 branch will change up
>>>>>>>>>> until 3.7.0 is released. Once it gets released, it's never
>>>>>>>>>> unreleased. We just move on to 3.7.1. Same thing here.
>>>>>>>>>>
>>>>>>>>>> best,
>>>>>>>>>> Colin
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Justine
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jan 8, 2024 at 12:44 PM Colin McCabe <cmcc...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Proven,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the KIP. I think there is a need for this
>>>>>>>>>>>> capability, for those of us who deploy from trunk (or branches
>>>>>>>>>>>> derived from trunk).
>>>>>>>>>>>>
>>>>>>>>>>>> With regard to "unstable.metadata.versions.enable": is this
>>>>>>>>>>>> going to be a documented configuration, or an internal one?
>>>>>>>>>>>> I am guessing we want it to be documented, so that users can
>>>>>>>>>>>> use it. If we do, we should probably also very prominently warn
>>>>>>>>>>>> that THIS WILL BREAK UPGRADES FOR YOUR CLUSTER. That includes
>>>>>>>>>>>> logging an ERROR message on startup, etc.
>>>>>>>>>>>>
>>>>>>>>>>>> It would be good to document if a release can go out that
>>>>>>>>>>>> contains "future MVs" that are unstable. Like can we make a 3.8
>>>>>>>>>>>> release that contains IBP_4_0_IV0 in MetadataVersion.java, as an
>>>>>>>>>>>> unstable future MV? Personally I think the answer should be
>>>>>>>>>>>> "yes," but with the usual caveats. When the actual 4.0 comes
>>>>>>>>>>>> out, the unstable 4.0 MV that shipped in 3.8 probably won't
>>>>>>>>>>>> work, and you won't be able to upgrade. (It was unstable, we
>>>>>>>>>>>> told you not to use it.)
>>>>>>>>>>>>
>>>>>>>>>>>> best,
>>>>>>>>>>>> Colin
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jan 5, 2024, at 07:32, Proven Provenzano wrote:
>>>>>>>>>>>>> Hey folks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am starting a discussion thread for managing unstable
>>>>>>>>>>>>> metadata versions in Apache Kafka.
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1014%3A+Managing+Unstable+Metadata+Versions+in+Apache+Kafka
>>>>>>>>>>>>>
>>>>>>>>>>>>> This KIP is actually already implemented in 3.7 with PR
>>>>>>>>>>>>> https://github.com/apache/kafka/pull/14860.
>>>>>>>>>>>>> I have created this KIP to explain the motivation and how
>>>>>>>>>>>>> managing Metadata Versions is expected to work.
>>>>>>>>>>>>> Comments are greatly appreciated as this process can always be
>>>>>>>>>>>>> improved.
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> --Proven
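
For anyone skimming the thread, here is a rough Java sketch of the gating idea discussed above: versions at or below LATEST_PRODUCTION are stable, anything above is unstable and only usable when the operator opts in. This is a toy model under stated assumptions, not the code from the PR. The class UnstableMvSketch, the enum Mv with its constants MV_1 to MV_4, and the method isAllowed are all hypothetical names, and the boolean parameter only stands in for the "unstable.metadata.versions.enable" setting mentioned in the thread.

```java
// Hypothetical sketch only; this is NOT Kafka's actual MetadataVersion class.
final class UnstableMvSketch {

    /** Toy metadata versions, declared in ascending order. Names are illustrative. */
    enum Mv {
        MV_1, MV_2, MV_3, MV_4;

        /** Highest version with stability guarantees; everything above it is unstable. */
        static final Mv LATEST_PRODUCTION = MV_2;

        /** A version is stable if it is not above LATEST_PRODUCTION. */
        boolean isProductionReady() {
            return compareTo(LATEST_PRODUCTION) <= 0;
        }
    }

    /**
     * Returns true if a cluster may use the requested version.
     * unstableEnabled stands in for the "unstable.metadata.versions.enable" setting.
     */
    static boolean isAllowed(Mv requested, boolean unstableEnabled) {
        return requested.isProductionReady() || unstableEnabled;
    }

    public static void main(String[] args) {
        // Print the stability and gating decision for every toy version.
        for (Mv mv : Mv.values()) {
            System.out.println(mv
                + " stable=" + mv.isProductionReady()
                + " allowedByDefault=" + isAllowed(mv, false)
                + " allowedWithFlag=" + isAllowed(mv, true));
        }
    }
}
```

In this model, "bumping LATEST_PRODUCTION" just means pointing the constant at a higher entry, and reordering unstable versions means rearranging the entries above it, which is why nothing above the constant carries any compatibility promise.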