Agreed there’s no reason to pull it out. I was just wondering what state it was in, given I didn’t see it mentioned in the CEP.
> On Feb 14, 2022, at 8:12 AM, Mike Adamson <madam...@datastax.com> wrote:
>
> > We don't need a whole "codec framework" for V1, but we're still embedding
> > some versioning information in the column index on-disk structures, right?
>
> I’m not sure why we would want to pull the versioning code only to have to
> put it back in as soon as we need to change the on-disk format. We also need
> to consider whether the legacy format used by DSE is supported in OSS. I’m
> not sure of the policy on this although I strongly suspect that the answer is
> that it won’t be supported. Either way, it would seem to be a lot of work to
> pull the versioning code out at this point since it formed part of a major
> refactor of the SAI framework and plumbing.
>
> MikeA
>
>> On 11 Feb 2022, at 18:47, Caleb Rackliffe <calebrackli...@gmail.com> wrote:
>>
>> Just finished reading the latest version of the CEP. Here are my thoughts:
>>
>> - We've already talked about OR queries, so I won't rehash that, but
>> tokenization support seems like it might be another one of those places
>> where we can cut scope if we want to get V1 out the door. It shouldn't be
>> that hard to detangle from the rest of the code.
>> - We mention the JMX metric ecosystem in the CEP, but not the related
>> virtual tables. This isn't a big issue, and doesn't mean we need to change
>> the CEP, but it might be helpful for those not familiar with the existing
>> prototype to know they exist :)
>> - It's probably below the line for CEP discussion, but the text and numeric
>> index formats will probably change over time. We don't need a whole "codec
>> framework" for V1, but we're still embedding some versioning information in
>> the column index on-disk structures, right?
>>
>> To offset my obvious partiality around this CEP, I've already made an effort
>> to raise some of the issues that may come up to challenge us from a macro
>> perspective.
>> It seems like the prevailing opinion here is that they are
>> either surmountable or simply basic conceptual difficulties w/ distributed
>> secondary indexing.
>>
>> tl;dr I'm +1 on bringing this to a vote and starting to put together all the
>> pieces for CASSANDRA-16052 :)
>>
>>> On Thu, Feb 10, 2022 at 11:26 AM Mike Adamson <madam...@datastax.com> wrote:
>>> > I'd be interested to hear from Mike/Jason on the OR support topic, of
>>> > course.
>>>
>>> The support for OR within SAI is fairly minimal and will not work without
>>> the non-SAI changes needed. Since the non-SAI OR changes are extensive it
>>> would be better to bring those in under their own CEP.
>>>
>>> I’d leave the decision of whether to put the rest of SAI behind an
>>> experimental flag to others. My preference would be to not do so because
>>> the non-OR implementation has been tested and used on production for over a
>>> year now.
>>>
>>> MikeA
>>>
>>>> On 9 Feb 2022, at 13:06, bened...@apache.org wrote:
>>>>
>>>> > Is there some mechanism such as experimental flags, which would allow
>>>> > the SAI-only OR support to be merged into trunk
>>>>
>>>> FWIW, I’m OK with this merging to trunk, either hidden behind a CI-only
>>>> flag or exposed to the user via some experimental flag (and a suitable
>>>> NEWS.txt). We’ve discussed the need to periodically merge feature branches
>>>> with trunk before they are complete. If the work is logically complete for
>>>> SAI, and we’re only pending work to make OR consistent between SAI and
>>>> non-SAI queries, I think that more than meets this criterion.
>>>>
>>>>
>>>> From: Henrik Ingo <henrik.i...@datastax.com>
>>>> Date: Monday, 7 February 2022 at 12:03
>>>> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
>>>> Subject: Re: [DISCUSS] CEP-7 Storage Attached Index
>>>>
>>>> Thanks Benjamin for reviewing and raising this.
>>>>
>>>> While I don't speak for the CEP authors, just some thoughts from me:
>>>>
>>>> On Mon, Feb 7, 2022 at 11:18 AM Benjamin Lerer <ble...@apache.org> wrote:
>>>> I would like to raise 2 points regarding the current CEP proposal:
>>>>
>>>> 1. There are mentions of some target versions and of the removal of SASI
>>>>
>>>> At this point, we have not agreed on any version numbers and I do not feel
>>>> that removing SASI should be part of the proposal for now.
>>>> It seems to me that we should first see the adoption surrounding SAI
>>>> before talking about deprecating other solutions.
>>>>
>>>>
>>>> This seems rather uncontroversial. I think the CEP template and previous
>>>> CEPs invite the discussion on whether the new feature will or may replace
>>>> an existing feature. But at the same time that's of course out of scope
>>>> for the work at hand. I have no opinion one way or the other myself.
>>>>
>>>>
>>>> 2. OR queries
>>>>
>>>> It is unclear to me if the proposal is about adding OR support only for
>>>> SAI indexes or for other types of queries too.
>>>> In the past, we had the nasty habit for CQL of providing only partially
>>>> implemented features, which resulted in a bad user experience.
>>>> Some examples are:
>>>> * LIKE restrictions which were introduced for the needs of SASI and were
>>>> never supported for other types of queries
>>>> * IS NOT NULL restrictions for MATERIALIZED VIEWS that are not supported
>>>> elsewhere
>>>> * != operator only supported for conditional inserts or updates
>>>> And there are unfortunately many more.
>>>>
>>>> We are currently slowly trying to fix those issues and make CQL a more
>>>> mature language. As a consequence, I would like us to change our way of
>>>> doing things. If we introduce support for OR it should also cover all the
>>>> other types of queries and be fully tested.
>>>> I also believe that it is a feature that due to its complexity fully
>>>> deserves its own CEP.
>>>>
>>>>
>>>> The current code that would be submitted for review after the CEP is
>>>> adopted contains OR support beyond just SAI indexes. An initial
>>>> implementation first targeted only such queries where all columns in a
>>>> WHERE clause using OR needed to be backed by an SAI index. This has since
>>>> been extended to also support ALLOW FILTERING mode as well as OR with
>>>> clustering key columns. The current implementation is by no means perfect
>>>> as general purpose OR support; the focus all the time was on
>>>> implementing OR support in SAI. I'll leave it to others to enumerate
>>>> exactly the limitations of the current implementation.
>>>>
>>>> Seeing that Benedict also supports your point of view, I would steer the
>>>> conversation more into a project management perspective:
>>>> * How can we advance CEP-7 so that the bulk of the SAI code can still be
>>>> added to Cassandra, so that users can benefit from this new index type,
>>>> albeit without OR?
>>>> * This is also an important question from the point of view that this is a
>>>> large block of code that will inevitably diverge if it's not in trunk.
>>>> Also, merging it to trunk will allow future enhancements, including the OR
>>>> syntax btw, to happen against trunk (aka upstream first).
>>>> * Since OR support nevertheless is a feature of SAI, it needs to be at
>>>> least unit tested, but ideally would even be exposed so that it is
>>>> possible to test on the CQL level. Is there some mechanism such as
>>>> experimental flags, which would allow the SAI-only OR support to be merged
>>>> into trunk, while a separate CEP is focused on implementing "proper"
>>>> general purpose OR support? I should note that there is no guarantee that
>>>> the OR CEP would be implemented in time for the next release. So the
>>>> answer to this point needs to be something that doesn't violate the desire
>>>> for good user experience.
>>>>
>>>> henrik
>>>>
>>>>
>>>
>