Hey Jon,
The following quick test shows me that vector search is marked as
experimental (it just isn't flagged in cassandra.yaml the way materialized
views, etc. are):

cqlsh:k> CREATE TABLE t (pk int, str_val text, val vector<float, 3>,
PRIMARY KEY(pk));

cqlsh:k> CREATE CUSTOM INDEX ON t(val) USING 'StorageAttachedIndex';


Warnings :
SAI ANN indexes on vector columns are experimental and are not recommended
for production use.
They don't yet support SELECT queries with:
 * Consistency level higher than ONE/LOCAL_ONE.
 * Paging.
 * No LIMIT clauses.
 * PER PARTITION LIMIT clauses.
 * GROUP BY clauses.
 * Aggregation functions.
 * Filters on columns without a SAI index.
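
For contrast, a SELECT shaped to stay inside those limits might look like
the following. This is only a sketch and was not part of the quick test
above: it assumes the same table t, and the vector literal and the LIMIT
value are arbitrary.

cqlsh:k> CONSISTENCY LOCAL_ONE;
cqlsh:k> SELECT pk, str_val FROM t ORDER BY val ANN OF [0.1, 0.2, 0.3] LIMIT 3;

It keeps the consistency level at LOCAL_ONE, carries an explicit LIMIT, and
avoids PER PARTITION LIMIT, GROUP BY, aggregates and filters on non-indexed
columns.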


I do agree that there is also a distinction between experimental and
beta, but I need to think more before offering a concrete opinion or
suggestions here. I do believe this conversation is healthy to have and
shows the maturity of our project. Thank you, Josh!


Best regards,

Ekaterina


On Mon, 9 Dec 2024 at 13:21, Jon Haddad <j...@rustyrazorblade.com> wrote:

> The tough thing here is that MVs are marked experimental retroactively,
> because by the time the problems were known, there wasn't much anyone could
> do.  Experimental was our way of saying "oops, we screwed up, let's put a
> label on it" and the same label got applied to a bunch of new stuff
> including Java 17.  They're not even close to being in the same category,
> but they're labeled the same and people treat them as equivalent.
>
> If we knew MVs were so broken before they were merged, they would have
> been -1'ed.  Same with incremental repair (till 4.0), and vector search
> today.  I would have -1'ed all three of these if it was known how poorly
> they actually performed at the time they were committed.
>
> Side note, vector search isn't marked as experimental today, but it's not
> even usable for non-trivial datasets out of the box, so it should be marked
> as such at this point.
>
> I really wish this stuff was tested at a reasonable scale across various
> failure modes before merging, because the harm it does to the community is
> real.  We really shouldn't be put in a position where stuff gets released,
> hyped up, then we find it's obviously not ready for real world use.  I
> built my tooling (tlp-cluster, now easy-cass-lab, and tlp-stress, now
> easy-cass-stress), with this in mind, but sadly I haven't seen much use of
> it to verify patches.  The only reason I found a memory leak in
> CASSANDRA-15452 was because I used these tools on multi-TB datasets over
> several days.
>
>
> Jon
>
>
> On Mon, Dec 9, 2024 at 9:55 AM Slater, Ben via dev <
> dev@cassandra.apache.org> wrote:
>
>> I'm a little worried by the idea of grouping in MVs with things like a
>> Java version under the same "beta" label (acknowledging that they are
>> currently grouped under the same "experimental" label).
>>
>> To me, "beta" implies it's pretty close to production ready and there is
>> an intention to get it to production ready in the near future. I don't
>> think this really describes MVs as I don't see anyone looking like they are
>> trying to get them to really production ready (although I could easily be
>> wrong on that).
>>
>> Maybe there is an argument for "experimental"=this is here to get
>> feedback but there's no commitment it will make it to production ready and
>> "beta"=we think this is done but we'd like to see some production use
>> before declaring it stable. For beta, we'll treat bugs with the same
>> priority as "stable" (or at least close to)?
>>
>> Cheers
>> Ben
>>
>>
>>
>> ------------------------------
>> *From:* Jon Haddad <j...@rustyrazorblade.com>
>> *Sent:* 09 December 2024 09:43
>> *To:* dev@cassandra.apache.org <dev@cassandra.apache.org>
>> *Subject:* Re: [DISCUSS] Experimental flagging (fork from Re-evaluate
>> compaction defaults in 5.1/trunk)
>>
>>
>> I like this.  There's a few things marked as experimental today, so I'll
>> take a stab at making this more concrete, and I think we should be open to
>> graduating certain things out of beta to GA at a faster cycle than a major
>> release.
>>
>> Java versions, for example, should really move out of "beta" quickly.  We
>> test against it, and we're not going to drop new versions.  So if we're
>> looking at C* 5.0, we should move Java 17 out of experimental / beta
>> immediately and call it GA.
>>
>> SAI and UCS should probably graduate no later than 5.1.
>>
>> On the other hand, MVs have enough warts that I actively recommend against
>> using them, and they should be in beta till we can actually repair them.
>>
>> I don't know if anyone's actually used transient replication and if it's
>> even beta quality... that might actually warrant being called experimental
>> still.
>>
>> 'ALTER ... DROP COMPACT STORAGE' is flagged as experimental.  I'm not
>> sure what to do with this.  I advise people migrate their data for any
>> Thrift -> CQL cases, mostly because the edge cases are so hard to know in
>> advance, especially since by now these codebases are ancient and the
>> original developers are long gone.
>>
>> Thoughts?
>>
>> Jon
>>
>>
>>
>>
>> On Mon, Dec 9, 2024 at 6:28 AM Josh McKenzie <jmcken...@apache.org>
>> wrote:
>>
>> Jon stated:
>>
>> Side note: I think experimental has been over-used and has lost all
>> meaning.  How is Java 17 experimental?  Very confusing for the community.
>>
>>
>> Dinesh followed with:
>>
>> Philosophically, as a project, we should wait until critical features
>> like these reach a certain level of maturity prior to recommending it as a
>> default. For me maturity is a function of adoption by diverse use-cases in
>> production and scale.
>>
>>
>> I'd like to discuss 2 ideas related to the above:
>>
>>    1. We rename / alias "experimental" to "beta". It's a word that's
>>    ubiquitous in our field and communicates the correct level of expectation
>>    to our users (API stable, may have bugs)
>>    2. *All new features* go through one major release (either semver MAJOR
>>    or MINOR) as "beta"
>>
>>
>> To Jon's point, "experimental" was really a kludge to work around
>> Materialized Views having some very sharp edges that users had to be very
>> aware of. We haven't really used the flagging much (at all?) since then,
>> and we don't have a formalized way to shepherd a new feature through a
>> "soak" period where it can "reach a certain level of maturity". We're
>> caught in a chicken-or-egg scenario with our current need to get a feature
>> released more broadly to have confidence in its stability (to Dinesh's
>> point).
>>
>> In my mind, the following feature evolution would be healthy for us and
>> good for our users:
>>
>>    1. Beta
>>    2. Generally Available
>>    3. Default (where appropriate)
>>
>> To graduate from Beta -> GA, good UX, user facing documentation, a
>> [DISCUSS] thread where we have a clear consensus of readiness, all seem
>> like healthy and good steps. From GA -> Default, [DISCUSS] like we're
>> having re: compaction strategies, unearthing shortcomings, edge-cases,
>> documentation needs, etc.
>>
>> Curious what others think.
>>
>> ~Josh
>>
>>
