Re: [DISCUSS] Experimental flagging (fork from Re-evaluate compaction defaults in 5.1/trunk)

Josh McKenzie Tue, 10 Dec 2024 11:10:58 -0800

So some questions to test a world w/3 classifications (Preview, Beta, GA):
- What would we do with the current experimental features (MV's, JDK17, 
witnesses, etc)? Flag them as preview or beta as appropriate on a case-by-case 
basis and add runtime warnings / documentation where missing?


- What would we do in the future if a feature's GA and we discover a Very Big 
Problem with it that'll take some time to fix? Keep it GA but cut a hotfix 
release w/a bunch of warnings? Bounce it back to Preview? Leave it be and just 
feverishly try and fix it?

> for policy decisions like this (that don’t need to be agreed in advance) we 
> should try to legislate the minimum necessary policy to proceed today
Definitely agree; MV's being in limbo for years strains the "3-step 
classification" structure for me. If we want to avoid having a solution for the 
MV-shaped case on the grounds we won't allow ourselves to reach this state 
again in the future, that seems reasonable. With the caveat that we *might* be 
in a similar situation with vector search right now, etc.
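
To make the flows quoted further down concrete, here is a rough sketch of the lifecycle as a small state machine. It is illustrative only, not project code: the names `Stage`, `ALLOWED`, and `is_valid_flow` are hypothetical, and the allowed transitions are an assumption read off the three example flows (simple, late-discovered defect, pathological) in the quoted message.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical feature-readiness labels taken from the thread."""
    PREVIEW = "Preview"
    BETA = "Beta"
    GA = "GA"
    UNSTABLE = "Unstable"   # the contested fourth state
    REMOVED = "Removed"


# Assumed transition table, derived only from the example flows quoted below.
ALLOWED = {
    Stage.PREVIEW: {Stage.BETA},
    Stage.BETA: {Stage.GA, Stage.UNSTABLE},
    Stage.GA: {Stage.UNSTABLE},
    Stage.UNSTABLE: {Stage.BETA, Stage.PREVIEW, Stage.REMOVED},
    Stage.REMOVED: set(),
}


def is_valid_flow(flow):
    """Return True if every consecutive pair in `flow` is an allowed transition."""
    return all(nxt in ALLOWED[cur] for cur, nxt in zip(flow, flow[1:]))


simple = [Stage.PREVIEW, Stage.BETA, Stage.GA]
late_defect = [Stage.PREVIEW, Stage.BETA, Stage.UNSTABLE, Stage.BETA, Stage.GA]
worst_case = [Stage.PREVIEW, Stage.BETA, Stage.GA, Stage.UNSTABLE, Stage.REMOVED]

for flow in (simple, late_defect, worst_case):
    print(" -> ".join(s.value for s in flow), is_valid_flow(flow))
```

In a world with only Preview, Beta and GA, the `UNSTABLE` rows disappear and the table has no outgoing edge from `GA`, which is exactly the open question about a GA feature with a Very Big Problem.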


On Tue, Dec 10, 2024, at 1:48 PM, Benedict Elliott Smith wrote:
> Yep, I agree with this - we can revisit if we ever absolutely feel the need 
> to add additional states for exceptional circumstances.
> 
> > On 10 Dec 2024, at 13:24, Patrick McFadin <[email protected]> wrote:
> > 
> > -1 on unstable. It's way more words than are needed. Three is a
> > magic number and fits:
> > 
> > Preview
> > Beta
> > GA
> > 
> > As a matter of testing the process, any pending CEP should go through
> > this exercise so we can see how it will work.
> > 
> > PS
> > Got the actual numbers from Whimsy.
> > DEV - 1425 users
> > USER - 2650
> > 
> > This means that when features experience a state change, finding more
> > avenues to get the word out will be important.
> > 
> > On Tue, Dec 10, 2024 at 10:04 AM Benedict Elliott Smith
> > <[email protected]> wrote:
> >> 
> >> As an aside, it would be nice to admit we basically revisit everything 
> >> each time it becomes relevant again, and for policy decisions like this 
> >> (that don’t need to be agreed in advance) we should try to legislate the 
> >> minimum necessary policy to proceed today, and leave future refinements 
> >> for later when the relevant context arises.
> >> 
> >> On 10 Dec 2024, at 13:00, Benedict Elliott Smith <[email protected]> 
> >> wrote:
> >> 
> >> I agree with Aleksey that if we think something is broken, we shouldn’t 
> >> use euphemisms, and for this reason I don’t like unstable (this could for 
> >> instance simply mean API unstable). If we intend to never need this 
> >> descriptor, we should avoid bike-shedding and insert a “placeholder” for 
> >> now to be refined as and when we need it when we have the necessary future 
> >> context.
> >> 
> >> i.e.
> >> 
> >> preview -> beta -> [“has problems that will take time to resolve 
> >> placeholder” -> beta] -> GA
> >> 
> >> 
> >> 
> >> On 10 Dec 2024, at 12:39, Josh McKenzie <[email protected]> wrote:
> >> 
> >> +1 to this classification with one addition. I think we need to augment 
> >> this with formalization on what we do with features we don't recommend 
> >> people use (i.e. MV in their current incarnation). For something 
> >> retroactively found to be unstable, we could add an "Unstable" 
> >> qualification for it, leaving us with:
> >> 
> >> Unstable: Warnings on use, clearly communicated as to why, either on-track 
> >> to be fixed or removed from the codebase. No lingering for years in a 
> >> fugue state. We should target never needing this classification.
> >> Preview: Ready to be tried by end users but has caveats and most likely is 
> >> not API stable. Developer-only documentation acceptable.
> >> Beta: Feature complete/API stable but has not had enough testing to be 
> >> considered rock solid. Developer and User documentation required.
> >> GA: Ready for use, no known issues, PMC is satisfied with the testing that 
> >> has been done
> >> 
> >> 
> >> To walk through how some of the flow might look to test the above:
> >> 
> >> Simple case:
> >> - Preview -> Beta -> GA
> >> 
> >> Late discovered defect case:
> >> - Preview -> Beta -> Unstable -> Beta -> GA
> >> 
> >> Pathological worst-case (i.e. MV):
> >> - Preview -> Beta -> GA -> Unstable -> [Preview|Removed]
> >> 
> >> On Tue, Dec 10, 2024, at 12:29 PM, Jeremiah Jordan wrote:
> >> 
> >> I agree with Aleksey and Patrick.  We should define terminology and then 
> >> stick to it.  My preferred list would be:
> >> 
> >> Preview - Ready to be tried by end users but has caveats and most likely 
> >> is not API stable.
> >> Beta - Feature complete/API stable but has not had enough testing to be 
> >> considered rock solid.
> >> GA - Ready for use, no known issues, PMC is satisfied with the testing that 
> >> has been done
> >> 
> >> 
> >> Whether or not something is enabled by default or the default 
> >> implementation is a separate axis from the readiness.  Though if we are 
> >> replacing an existing thing with a new default I would hope we apply extra 
> >> rigor to allowing that to happen.
> >> 
> >> -Jeremiah
> >> 
> >> On Dec 10, 2024 at 11:15:37 AM, Patrick McFadin <[email protected]> wrote:
> >> 
> >> I'm going to try to pull this back from the inevitable bikeshedding
> >> and airing of grievances that happen. Rewind all the way back to
> >> Josh's original point, which is a defined process. Why I really love
> >> this being brought up is our maturing process of communicating to the
> >> larger user base. The dev list has very few participants. Less than
> >> 1000 last I looked. Most users I talk to just want to know what they
> >> are getting. Well-formed, clear communication is how the PMC can let
> >> end users know that a new feature is one of three states:
> >> 
> >> 1. Beta
> >> 2. Generally Available
> >> 3. Default (where appropriate)
> >> 
> >> Yes! The work is just sorting out what each level means and then
> >> codifying that in confluence. Then, we look at any features that are
> >> under question, assign a level, and determine what it takes to go from
> >> one state to another.
> >> 
> >> The CEPs need to reflect this change. What makes a Beta, GA, Default
> >> for new feature X. It makes it clear for implementers and end users,
> >> which is an important feature of project maturity.
> >> 
> >> Patrick
> >> 
> >> 
> >> 
> >> On Dec 10, 2024 at 5:46:38 AM, Aleksey Yeshchenko <[email protected]> 
> >> wrote:
> >> 
> >> What we’ve done is we’ve overloaded the term ‘experimental’ to mean too 
> >> many related but different ideas. We need additional, more specific 
> >> terminology to disambiguate.
> >> 
> >> 1. Labelling released features that were known to be unstable at release 
> >> as ‘experimental’ retroactively shouldn’t happen and AFAIK only happened 
> >> once, with MVs, and ‘experimental’ there was just a euphemism for 
> >> ‘broken’. Our practices are mature enough now, I like to think, that a 
> >> situation like this would not arise in the future - the bar for releasing 
> >> a completed marketable feature is higher. So the label ‘experimental’ 
> >> should not be applied retroactively to anything.
> >> 
> >> 2. It’s possible that a released feature, once considered 
> >> production-ready, might be discovered to be deeply flawed after release. 
> >> We need to temporarily mark such a feature as ‘broken’ or 
> >> ‘flawed’. Not experimental, and not even ‘unstable’. Make sure we emit a 
> >> warning on its use everywhere, and, if possible, make it opt-in in the 
> >> next major, at the very least, to prevent new uses of it. Announce on dev, 
> >> add a note in NEWS.txt, etc. If the flaws are later addressed, remove the 
> >> label. Removing the feature itself might not be possible, but should be 
> >> considered, with heavy advance telegraphing to the community.
> >> 
> >> 3. There is probably room for genuine use of ‘experimental’ as a feature 
> >> label. For opt-in features that we commit with an understanding that they 
> >> might not make it at all. Unstable API is implied here, but a feature can 
> >> also have an unstable API without being experimental - so ‘experimental’ 
> >> doesn’t equal ‘api-unstable’. These should not be relied on by any 
> >> production code; they would be heavily gated by unambiguous configuration 
> >> flags, disabled by default, allowed to be removed or changed in any 
> >> version including a minor one.
> >> 
> >> 4. New features without known flaws, intended to be production-ready and 
> >> marketable eventually, that we may want to gain some real-world confidence 
> >> with before we are happy to market or make default. UCS, for example, 
> >> which seems to be in heavy use in Astra and doesn’t have any known open 
> >> issues (AFAIK). It’s not experimental, it’s not unstable, it’s not ‘alpha’ 
> >> or ‘beta’, it just hasn't been widely enough used to have gained a lot of 
> >> confidence. It’s just new. I’m not sure what label even applies here. It’s 
> >> just a regular feature that happens to be new, doesn’t need a label, just 
> >> needs to see some widespread use before we can make it a default. No other 
> >> limitation on its use.
> >> 
> >> 5. Early-integrated, not-yet fully-completed features that are NOT 
> >> experimental in nature. Isolated, gated behind deep configuration flags. 
> >> Have a CEP behind them, we trust that they will be eventually completed, 
> >> but for pragmatic reasons it just made sense to commit them at an earlier 
> >> stage. ‘Preview’, ‘alpha’, ‘beta’ are labels that could apply here 
> >> depending on current feature readiness status. API-instability is implied. 
> >> Once finished they just become a regular new feature, no flag needed, no 
> >> heavy config gating needed.
> >> 
> >> I might be missing some scenarios here.
> >> 
> >> 
> >> 
> 
>
