Re: [DISCUSS] Experimental flagging (fork from Re-evaluate compaction defaults in 5.1/trunk)

Jon Haddad Mon, 09 Dec 2024 11:35:32 -0800

Huh, I didn't notice it when I grepped the code base.  I stand corrected.

Jon


On Mon, Dec 9, 2024 at 10:57 AM Ekaterina Dimitrova <e.dimitr...@gmail.com>
wrote:

> Hey Jon,
> The following quick test shows me that vector search is marked as
> experimental (it just isn't in cassandra.yaml the way materialized views,
> etc. are)
>
> cqlsh:k> CREATE TABLE t (pk int, str_val text, val vector<float, 3>,
> PRIMARY KEY(pk));
>
> cqlsh:k> CREATE CUSTOM INDEX ON t(val) USING 'StorageAttachedIndex';
>
>
> Warnings :
>
> SAI ANN indexes on vector columns are experimental and are not recommended
> for production use.
>
> They don't yet support SELECT queries with:
>
>  * Consistency level higher than ONE/LOCAL_ONE.
>  * Paging.
>  * No LIMIT clauses.
>  * PER PARTITION LIMIT clauses.
>  * GROUP BY clauses.
>  * Aggregation functions.
>  * Filters on columns without a SAI index.
>
>
> I do agree that there is also a distinction between experimental and
> beta, but I need to think more before expressing a concrete
> opinion or suggestions here. Still, I believe this conversation is healthy to
> have and shows the maturity of our project. Thank you, Josh!
>
>
> Best regards,
>
> Ekaterina
>
>
> On Mon, 9 Dec 2024 at 13:21, Jon Haddad <j...@rustyrazorblade.com> wrote:
>
>> The tough thing here is that MVs are marked experimental retroactively,
>> because by the time the problems were known, there wasn't much anyone could
>> do.  Experimental was our way of saying "oops, we screwed up, let's put a
>> label on it" and the same label got applied to a bunch of new stuff
>> including Java 17.  They're not even close to being in the same category,
>> but they're labeled the same and people treat them as equivalent.
>>
>> If we had known MVs were so broken before they were merged, they would have
>> been -1'ed.  Same with incremental repair (till 4.0), and vector search
>> today.  I would have -1'ed all three of these if it had been known how poorly
>> they actually performed at the time they were committed.
>>
>> Side note, vector search isn't marked as experimental today, but it's not
>> even usable for non-trivial datasets out of the box, so it should be marked
>> as such at this point.
>>
>> I really wish this stuff was tested at a reasonable scale across various
>> failure modes before merging, because the harm it does to the community is
>> real.  We really shouldn't be put in a position where stuff gets released,
>> hyped up, then we find it's obviously not ready for real world use.  I
>> built my tooling (tlp-cluster, now easy-cass-lab, and tlp-stress, now
>> easy-cass-stress), with this in mind, but sadly I haven't seen much use of
>> it to verify patches.  The only reason I found a memory leak in
>> CASSANDRA-15452 was because I used these tools on multi-TB datasets over
>> several days.
>>
>>
>> Jon
>>
>>
>> On Mon, Dec 9, 2024 at 9:55 AM Slater, Ben via dev <
>> dev@cassandra.apache.org> wrote:
>>
>>> I'm a little worried by the idea of grouping in MVs with things like a
>>> Java version under the same "beta" label (acknowledging that they are
>>> currently grouped under the same "experimental" label).
>>>
>>> To me, "beta" implies it's pretty close to production ready and there is
>>> an intention to get it to production ready in the near future. I don't
>>> think this really describes MVs as I don't see anyone looking like they are
>>> trying to get them to really production ready (although I could easily be
>>> wrong on that).
>>>
>>> Maybe there is an argument for "experimental"=this is here to get
>>> feedback but there's no commitment it will make it to production ready and
>>> "beta"=we think this is done but we'd like to see some production use
>>> before declaring it stable. For beta, we'll treat bugs with the same
>>> priority as "stable" (or at least close to)?
>>>
>>> Cheers
>>> Ben
>>>
>>>
>>>
>>> ------------------------------
>>> *From:* Jon Haddad <j...@rustyrazorblade.com>
>>> *Sent:* 09 December 2024 09:43
>>> *To:* dev@cassandra.apache.org <dev@cassandra.apache.org>
>>> *Subject:* Re: [DISCUSS] Experimental flagging (fork from Re-evaluate
>>> compaction defaults in 5.1/trunk)
>>>
>>>
>>> I like this.  There's a few things marked as experimental today, so I'll
>>> take a stab at making this more concrete, and I think we should be open to
>>> graduating certain things out of beta to GA at a faster cycle than a major
>>> release.
>>>
>>> Java versions, for example, should really move out of "beta" quickly.
>>> We test against it, and we're not going to drop new versions.  So if we're
>>> looking at C* 5.0, we should move Java 17 out of experimental / beta
>>> immediately and call it GA.
>>>
>>> SAI and UCS should probably graduate no later than 5.1.
>>>
>>> On the other hand, MVs have enough warts that I actively recommend against
>>> using them, and they should stay in beta till we can actually repair them.
>>>
>>> I don't know if anyone's actually used transient replication and if it's
>>> even beta quality... that might actually warrant being called experimental
>>> still.
>>>
>>> 'ALTER ... DROP COMPACT STORAGE' is flagged as experimental.  I'm not
>>> sure what to do with this.  I advise people to migrate their data for any
>>> Thrift -> CQL cases, mostly because the edge cases are so hard to know in
>>> advance, especially since by now these codebases are ancient and the
>>> original developers are long gone.
>>>
>>> Thoughts?
>>>
>>> Jon
>>>
>>>
>>>
>>>
>>> On Mon, Dec 9, 2024 at 6:28 AM Josh McKenzie <jmcken...@apache.org>
>>> wrote:
>>>
>>> Jon stated:
>>>
>>> Side note: I think experimental has been over-used and has lost all
>>> meaning.  How is Java 17 experimental?  Very confusing for the community.
>>>
>>>
>>> Dinesh followed with:
>>>
>>> Philosophically, as a project, we should wait until critical features
>>> like these reach a certain level of maturity prior to recommending it as a
>>> default. For me maturity is a function of adoption by diverse use-cases in
>>> production and scale.
>>>
>>>
>>> I'd like to discuss 2 ideas related to the above:
>>>
>>>    1. We rename / alias "experimental" to "beta". It's a word that's
>>>    ubiquitous in our field and communicates the correct level of expectation
>>>    to our users (API stable, may have bugs)
>>>    2. *All new features* go through one major (either semver MAJOR or
>>>    MINOR) as "beta"
>>>
>>>
>>> To Jon's point, "experimental" was really a kludge to work around
>>> Materialized Views having some very sharp edges that users had to be very
>>> aware of. We haven't really used the flagging much (at all?) since then,
>>> and we don't have a formalized way to shepherd a new feature through a
>>> "soak" period where it can "reach a certain level of maturity". We're
>>> caught in a chicken-or-egg scenario with our current need to get a feature
>>> released more broadly to have confidence in its stability (to Dinesh's
>>> point).
>>>
>>> In my mind, the following feature evolution would be healthy for us and
>>> good for our users:
>>>
>>>    1. Beta
>>>    2. Generally Available
>>>    3. Default (where appropriate)
>>>
>>> To graduate from Beta -> GA, good UX, user facing documentation, a
>>> [DISCUSS] thread where we have a clear consensus of readiness, all seem
>>> like healthy and good steps. From GA -> Default, [DISCUSS] like we're
>>> having re: compaction strategies, unearthing shortcomings, edge-cases,
>>> documentation needs, etc.
>>>
>>> Curious what others think.
>>>
>>> ~Josh
>>>
>>>
