There will be a LOT of content around using SAI in 5.0. CCing the marketing mailing list.
On Wed, May 10, 2023 at 8:38 PM Jeff Jirsa <jji...@gmail.com> wrote:

> Changes like this always scare me, but the benefits probably outweigh the
> risks. Probably obvious to whoever implements it, but please make sure
> that, if this happens, it is super visible in NEWS and simultaneously
> updates the to-string / to-cql representation of the schema in cqlsh /
> drivers / snapshots.
>
> On Wed, May 10, 2023 at 8:27 PM Patrick McFadin <pmcfa...@gmail.com>
> wrote:
>
>> Having pulled a lot of developers out of the 2i fire, I would love it if
>> defaults got a bit more sane. Adding USING...WITH... on CREATE INDEX
>> seems like the right move for most developers that don't read docs and
>> assume behavior.
>>
>> As much as I hate that 2i would be the configured default, I get it. New
>> feature and this is the right thing for users. Would there be any way to
>> switch 2i to SAI for the same index declaration? That would make for a nice
>> upgrade for users moving to 5 without having to re-create indexes.
>>
>> Patrick
>>
>> On Wed, May 10, 2023 at 9:28 AM David Capwell <dcapw...@apple.com> wrote:
>>
>>> Having to revert to CREATE CUSTOM INDEX sounds pretty awful, so I'd
>>> prefer allowing USING...WITH... for CREATE INDEX
>>>
>>> I have 0 issues with a new syntax to make this more clear
>>>
>>> just deprecating CREATE CUSTOM INDEX (at least after 5.0), but that's
>>> more or less what my original proposal was above (modulo the configurable
>>> default).
>>>
>>> I have 0 issues deprecating and producing a ClientWarning recommending
>>> the new syntax, but I would be against removing this syntax later on… it
>>> should be low effort to keep, so breaking a user would not be desirable for
>>> me.
>>>
>>> change only the fact that CREATE INDEX retains a configurable default
>>>
>>> This option allows users to control this behavior, and allows us to
>>> change the default over time. For 5.0 I am strongly against SAI being the
>>> default (new features disabled by default), but I wouldn’t have issues in
>>> later versions changing the default once it’s been out for a while.
>>>
>>> I’m not convinced by the changing defaults argument here. The
>>> characteristics of the two index types are very different, and users with
>>> scripts that make indexes today shouldn’t have their behaviour change.
>>>
>>> In my mind this is no different from defaulting to BTI in a follow up
>>> release, but if this concern is that the legacy index leaked details such
>>> as index tables, so changing the default would have side effects in the
>>> public domain that users might not expect, then I get it… are there other
>>> concerns?
>>>
>>> On May 10, 2023, at 9:03 AM, Caleb Rackliffe <calebrackli...@gmail.com>
>>> wrote:
>>>
>>> tl;dr If you take my original proposal and change only the fact that CREATE
>>> INDEX retains a configurable default, I think we get to the same place?
>>>
>>> (Then it's just a matter of what we do in 5.0 vs. after 5.0...)
>>>
>>> On Wed, May 10, 2023 at 11:00 AM Caleb Rackliffe <
>>> calebrackli...@gmail.com> wrote:
>>>
>>>> I see a broad desire here to have a configurable (YAML) default
>>>> implementation for CREATE INDEX. I'm not strongly opposed to that, as
>>>> the concept of a default index implementation is pretty standard for most
>>>> DBMS (see Postgres, etc.). However, keep in mind that if we do that, we
>>>> still need to either revert to CREATE CUSTOM INDEX or add the
>>>> USING...WITH... extensions to CREATE INDEX to override the default or
>>>> specify parameters, which will be in play once SAI supports basic text
>>>> tokenization/filtering. Having to revert to CREATE CUSTOM INDEX sounds
>>>> pretty awful, so I'd prefer allowing USING...WITH... for CREATE INDEX
>>>> and just deprecating CREATE CUSTOM INDEX (at least after 5.0), but
>>>> that's more or less what my original proposal was above (modulo the
>>>> configurable default).
>>>>
>>>> Thoughts?
>>>>
>>>> On Wed, May 10, 2023 at 2:59 AM Benedict <bened...@apache.org> wrote:
>>>>
>>>>> I’m not convinced by the changing defaults argument here. The
>>>>> characteristics of the two index types are very different, and users with
>>>>> scripts that make indexes today shouldn’t have their behaviour change.
>>>>>
>>>>> We could introduce new syntax that properly appreciates there’s no
>>>>> default index, perhaps CREATE LOCAL [type] INDEX? To also make clear that
>>>>> these indexes involve a partition key or scatter gather
>>>>>
>>>>> On 10 May 2023, at 06:26, guo Maxwell <cclive1...@gmail.com> wrote:
>>>>>
>>>>> +1, as we must improve the image of our own default indexing ability.
>>>>>
>>>>> And as for *CREATE CUSTOM INDEX*, should we just leave it as it is, and
>>>>> disable the ability to create SAI through *CREATE CUSTOM INDEX* in some
>>>>> version after 5.0?
>>>>>
>>>>> As far as I know, there may be users using this as a plugin-index
>>>>> interface, like https://github.com/Stratio/cassandra-lucene-index
>>>>> (though these projects may be inactive; but if someone wants to do
>>>>> something similar in the future, we don't have to stop them).
>>>>>
>>>>> Jonathan Ellis <jbel...@gmail.com> wrote on Wed, May 10, 2023 at 10:01:
>>>>>
>>>>>> +1 for this, especially in the long term. CREATE INDEX should do the
>>>>>> right thing for most people without requiring extra ceremony.
>>>>>>
>>>>>> On Tue, May 9, 2023 at 5:20 PM Jeremiah D Jordan <
>>>>>> jeremiah.jor...@gmail.com> wrote:
>>>>>>
>>>>>>> If the consensus is that SAI is the right default index, then we
>>>>>>> should just change CREATE INDEX to be SAI, and legacy 2i to be a CUSTOM
>>>>>>> INDEX.
>>>>>>>
>>>>>>> On May 9, 2023, at 4:44 PM, Caleb Rackliffe <
>>>>>>> calebrackli...@gmail.com> wrote:
>>>>>>>
>>>>>>> Earlier today, Mick started a thread on the future of our index
>>>>>>> creation DDL on Slack:
>>>>>>>
>>>>>>> https://the-asf.slack.com/archives/C018YGVCHMZ/p1683527794220019
>>>>>>>
>>>>>>> At the moment, there are two ways to create a secondary index.
>>>>>>>
>>>>>>> *1.) CREATE INDEX [IF NOT EXISTS] [name] ON <table> (<column>)*
>>>>>>>
>>>>>>> This creates an optionally named legacy 2i on the provided table and
>>>>>>> column.
>>>>>>>
>>>>>>> ex. CREATE INDEX my_index ON ks.tbl(my_text_col)
>>>>>>>
>>>>>>> *2.) CREATE CUSTOM INDEX [IF NOT EXISTS] [name] ON <table>
>>>>>>> (<column>) USING <class|alias> [WITH OPTIONS = <options>]*
>>>>>>>
>>>>>>> This creates a secondary index on the provided table and column
>>>>>>> using the specified 2i implementation class and (optional) parameters.
>>>>>>>
>>>>>>> ex. CREATE CUSTOM INDEX my_index ON ks.tbl(my_text_col) USING
>>>>>>> 'StorageAttachedIndex'
>>>>>>>
>>>>>>> (Note that the work on SAI added aliasing, so `StorageAttachedIndex`
>>>>>>> is shorthand for the fully-qualified class name, which is also valid.)
>>>>>>>
>>>>>>> So what is there to discuss?
>>>>>>>
>>>>>>> The concern Mick raised is...
>>>>>>>
>>>>>>> "...just folk continuing to use CREATE INDEX because they think CREATE
>>>>>>> CUSTOM INDEX is advanced (or just don't know of it), and we leave
>>>>>>> users doing 2i (when they think they are, and/or we definitely want
>>>>>>> them to be, using SAI)"
>>>>>>>
>>>>>>> To paraphrase, we want people to use SAI once it's available where
>>>>>>> possible, and the default behavior of CREATE INDEX could be at odds
>>>>>>> w/ that.
>>>>>>>
>>>>>>> The proposal we seem to have landed on is something like the
>>>>>>> following:
>>>>>>>
>>>>>>> For 5.0:
>>>>>>>
>>>>>>> 1.) Disable by default the creation of new legacy 2i via CREATE
>>>>>>> INDEX.
>>>>>>> 2.) Leave CREATE CUSTOM INDEX...USING... available by default.
>>>>>>>
>>>>>>> (Note: How this would interact w/ the existing
>>>>>>> secondary_indexes_enabled YAML options isn't clear yet.)
>>>>>>>
>>>>>>> Post-5.0:
>>>>>>>
>>>>>>> 1.) Deprecate and eventually remove SASI when SAI hits full feature
>>>>>>> parity w/ it.
>>>>>>> 2.) Replace both CREATE INDEX and CREATE CUSTOM INDEX w/ something
>>>>>>> of a hybrid between the two. For example, CREATE
>>>>>>> INDEX...USING...WITH. This would both be flexible enough to
>>>>>>> accommodate index implementation selection and prescriptive enough to
>>>>>>> force the user to make a decision (and wouldn't change the legacy
>>>>>>> behavior of the existing CREATE INDEX). In this world, creating a
>>>>>>> legacy 2i might look something like CREATE INDEX...USING `legacy`.
>>>>>>> 3.) Eventually deprecate CREATE CUSTOM INDEX...USING.
>>>>>>>
>>>>>>> Eventually we would have a single enabled DDL statement for index
>>>>>>> creation that would be minimal but also explicit/able to handle some
>>>>>>> evolution.
>>>>>>>
>>>>>>> What does everyone think?
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Jonathan Ellis
>>>>>> co-founder, http://www.datastax.com
>>>>>> @spyced
>>>>>>
>>>>>
>>>>> --
>>>>> you are the apple of my eye !
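
For anyone skimming the thread, here is a minimal sketch of the statements being compared. The first two are the existing forms described in the original proposal. The last two illustrate the CREATE INDEX...USING form, which is only a proposal in this thread, so its exact grammar and the 'legacy' alias are assumptions, not confirmed syntax. ks.tbl, my_text_col and my_index come from the thread's own examples; the other index names are made up for illustration. The statements are alternatives to compare, not a script to run in order.

```cql
-- Existing form 1: CREATE INDEX creates a legacy 2i.
CREATE INDEX my_index ON ks.tbl (my_text_col);

-- Existing form 2: CREATE CUSTOM INDEX picks the implementation by
-- class name or alias.
CREATE CUSTOM INDEX my_sai_index ON ks.tbl (my_text_col)
    USING 'StorageAttachedIndex';

-- Proposed (hypothetical): CREATE INDEX with an explicit USING clause,
-- overriding whatever default implementation is configured.
CREATE INDEX my_sai_index ON ks.tbl (my_text_col)
    USING 'StorageAttachedIndex';

-- Proposed (hypothetical): asking for a legacy 2i explicitly.
-- The 'legacy' alias is taken from the proposal text; the final
-- name may differ.
CREATE INDEX my_legacy_index ON ks.tbl (my_text_col)
    USING 'legacy';
```

With a configurable default, a bare CREATE INDEX would presumably resolve to whichever implementation the operator set in the YAML, and the USING clause would only be needed to override it or to pass index options.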