Huh, I didn't notice it when I grepped the code base. I stand corrected.

Jon

On Mon, Dec 9, 2024 at 10:57 AM Ekaterina Dimitrova <e.dimitr...@gmail.com> wrote:

> Hey Jon,
> The following quick test shows me that vector search is marked as
> experimental (it is just not in cassandra.yaml as materialized views, etc)
>
> cqlsh:k> CREATE TABLE t (pk int, str_val text, val vector<float, 3>,
> PRIMARY KEY(pk));
>
> cqlsh:k> CREATE CUSTOM INDEX ON t(val) USING 'StorageAttachedIndex';
>
> Warnings :
>
> SAI ANN indexes on vector columns are experimental and are not recommended
> for production use.
>
> They don't yet support SELECT queries with:
>
> * Consistency level higher than ONE/LOCAL_ONE.
> * Paging.
> * No LIMIT clauses.
> * PER PARTITION LIMIT clauses.
> * GROUP BY clauses.
> * Aggregation functions.
> * Filters on columns without a SAI index.
>
> I do agree that there is differentiation also between experimental and
> beta. But I need to think more before expressing concrete
> opinion/suggestions here. Though I believe this conversation is healthy to
> have and shows the maturity of our project. Thank you, Josh!
>
> Best regards,
> Ekaterina
>
> On Mon, 9 Dec 2024 at 13:21, Jon Haddad <j...@rustyrazorblade.com> wrote:
>
>> The tough thing here is that MVs are marked experimental retroactively,
>> because by the time the problems were known, there wasn't much anyone could
>> do. Experimental was our way of saying "oops, we screwed up, let's put a
>> label on it" and the same label got applied to a bunch of new stuff
>> including Java 17. They're not even close to being in the same category,
>> but they're labeled the same and people treat them as equivalent.
>>
>> If we knew MVs were so broken before they were merged, they would have
>> been -1'ed. Same with incremental repair (till 4.0), and vector search
>> today. I would have -1'ed all three of these if it was known how poorly
>> they actually performed at the time they were committed.
>>
>> Side note, vector search isn't marked as experimental today, but it's not
>> even usable for non-trivial datasets out of the box, so it should be marked
>> as such at this point.
>>
>> I really wish this stuff was tested at a reasonable scale across various
>> failure modes before merging, because the harm it does to the community is
>> real. We really shouldn't be put in a position where stuff gets released,
>> hyped up, then we find it's obviously not ready for real world use. I
>> built my tooling (tlp-cluster, now easy-cass-lab, and tlp-stress, now
>> easy-cass-stress), with this in mind, but sadly I haven't seen much use of
>> it to verify patches. The only reason I found a memory leak in
>> CASSANDRA-15452 was because I used these tools on multi-TB datasets over
>> several days.
>>
>> Jon
>>
>> On Mon, Dec 9, 2024 at 9:55 AM Slater, Ben via dev <dev@cassandra.apache.org> wrote:
>>
>>> I'm a little worried by the idea of grouping in MVs with things like a
>>> Java version under the same "beta" label (acknowledging that they are
>>> currently grouped under the same "experimental" label).
>>>
>>> To me, "beta" implies it's pretty close to production ready and there is
>>> an intention to get it to production ready in the near future. I don't
>>> think this really describes MVs as I don't see anyone looking like they are
>>> trying to get them to really production ready (although I could easily be
>>> wrong on that).
>>>
>>> Maybe there is an argument for "experimental"=this is here to get
>>> feedback but there's no commitment it will make it to production ready and
>>> "beta"=we think this is done but we'd like to see some production use
>>> before declaring it stable. For beta, we'll treat bugs with the same
>>> priority as "stable" (or at least close to)?
>>>
>>> Cheers
>>> Ben
>>>
>>> ------------------------------
>>> *From:* Jon Haddad <j...@rustyrazorblade.com>
>>> *Sent:* 09 December 2024 09:43
>>> *To:* dev@cassandra.apache.org <dev@cassandra.apache.org>
>>> *Subject:* Re: [DISCUSS] Experimental flagging (fork from Re-evaluate
>>> compaction defaults in 5.1/trunk)
>>>
>>> I like this. There's a few things marked as experimental today, so I'll
>>> take a stab at making this more concrete, and I think we should be open to
>>> graduating certain things out of beta to GA at a faster cycle than a major
>>> release.
>>>
>>> Java versions, for example, should really move out of "beta" quickly.
>>> We test against it, and we're not going to drop new versions. So if we're
>>> looking at C* 5.0, we should move Java 17 out of experimental / beta
>>> immediately and call it GA.
>>>
>>> SAI and UCS should probably graduate no later than 5.1.
>>>
>>> On the other hand, MVs have enough warts I actively recommend against
>>> using them and should be in beta till we can actually repair them.
>>>
>>> I don't know if anyone's actually used transient replication and if it's
>>> even beta quality... that might actually warrant being called experimental
>>> still.
>>>
>>> 'ALTER ... DROP COMPACT STORAGE' is flagged as experimental. I'm not
>>> sure what to do with this. I advise people migrate their data for any
>>> Thrift -> CQL cases, mostly because the edge cases are so hard to know in
>>> advance, especially since by now these codebases are ancient and the
>>> original developers are long gone.
>>>
>>> Thoughts?
>>>
>>> Jon
>>>
>>> On Mon, Dec 9, 2024 at 6:28 AM Josh McKenzie <jmcken...@apache.org>
>>> wrote:
>>>
>>> Jon stated:
>>>
>>> Side note: I think experimental has been over-used and has lost all
>>> meaning. How is Java 17 experimental? Very confusing for the community.
>>>
>>> Dinesh followed with:
>>>
>>> Philosophically, as a project, we should wait until critical features
>>> like these reach a certain level of maturity prior to recommending it as a
>>> default. For me maturity is a function of adoption by diverse use-cases in
>>> production and scale.
>>>
>>> I'd like to discuss 2 ideas related to the above:
>>>
>>> 1. We rename / alias "experimental" to "beta". It's a word that's
>>> ubiquitous in our field and communicates the correct level of expectation
>>> to our users (API stable, may have bugs)
>>> 2. *All new features* go through one major (either semver MAJOR or
>>> MINOR) as "beta"
>>>
>>> To Jon's point, "experimental" was really a kludge to work around
>>> Materialized Views having some very sharp edges that users had to be very
>>> aware of. We haven't really used the flagging much (at all?) since then,
>>> and we don't have a formalized way to shepherd a new feature through a
>>> "soak" period where it can "reach a certain level of maturity". We're
>>> caught in a chicken-or-egg scenario with our current need to get a feature
>>> released more broadly to have confidence in its stability (to Dinesh's
>>> point).
>>>
>>> In my mind, the following feature evolution would be healthy for us and
>>> good for our users:
>>>
>>> 1. Beta
>>> 2. Generally Available
>>> 3. Default (where appropriate)
>>>
>>> To graduate from Beta -> GA, good UX, user facing documentation, a
>>> [DISCUSS] thread where we have a clear consensus of readiness, all seem
>>> like healthy and good steps. From GA -> Default, [DISCUSS] like we're
>>> having re: compaction strategies, unearthing shortcomings, edge-cases,
>>> documentation needs, etc.
>>>
>>> Curious what others think.
>>>
>>> ~Josh