[jira] [Commented] (CASSANDRA-20189) Avoid possible consistency violations for SAI intersection queries over repaired index matches and multiple non-indexed column matches

Caleb Rackliffe (Jira) Mon, 27 Jan 2025 17:26:25 -0800


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-20189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17921580#comment-17921580
 ]


Caleb Rackliffe commented on CASSANDRA-20189:
---------------------------------------------

It turns out the seeds above were failing because Harry was generating range 
queries over SAI column indexes that don't support range queries. Those queries 
degraded to ALLOW FILTERING (AF), and since AF queries are still broken (see 
CASSANDRA-19007), we ran into failures. Once I made sure we only generated 
queries SAI supports, things started running clean. In any case, the patch is 
ready for review. It fixes what we set out to fix here (while expanding the 
Harry tests), and I'll have some CI results shortly...

> Avoid possible consistency violations for SAI intersection queries over 
> repaired index matches and multiple non-indexed column matches
> --------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-20189
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20189
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Consistency/Coordination, Feature/SAI
>            Reporter: Caleb Rackliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 5.0.x, 5.x
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{FilterTree}}, which is responsible for SAI's post-filtering, is too 
> aggressive about automatically using strict filtering when a.) only repaired 
> matches are returned from index columns and b.) there are still multiple 
> non-indexed columns that must be post-filtered. The following test 
> illustrates this:
> {noformat}
> @Test
> public void testPartialUpdatesOnNonIndexedColumnsAfterRepair()
> {
>    CLUSTER.schemaChange(withKeyspace("CREATE TABLE %s.partial_updates (k int PRIMARY KEY, a int, b int, c int) WITH read_repair = 'NONE'"));
>    CLUSTER.schemaChange(withKeyspace("CREATE INDEX ON %s.partial_updates(a) USING 'sai'"));
>    SAIUtil.waitForIndexQueryable(CLUSTER, KEYSPACE);
>    CLUSTER.coordinator(1).execute(withKeyspace("INSERT INTO %s.partial_updates(k, a) VALUES (0, 1) USING TIMESTAMP 1"), ConsistencyLevel.ALL);
>    CLUSTER.get(1).nodetoolResult("repair", KEYSPACE).asserts().success();
>    // insert a split row
>    CLUSTER.get(1).executeInternal(withKeyspace("INSERT INTO %s.partial_updates(k, b) VALUES (0, 2) USING TIMESTAMP 2"));
>    CLUSTER.get(2).executeInternal(withKeyspace("INSERT INTO %s.partial_updates(k, c) VALUES (0, 3) USING TIMESTAMP 3"));
>    String select = withKeyspace("SELECT * FROM %s.partial_updates WHERE a = 1 AND b = 2 AND c = 3 ALLOW FILTERING");
>    Object[][] initialRows = CLUSTER.coordinator(1).execute(select, ConsistencyLevel.ALL);
>    assertRows(initialRows, row(0, 1, 2, 3));
> }
> {noformat}
> This should be easy to fix without adversely affecting performance too much, 
> given that the selectivity of the clauses on the indexed columns still 
> determines the upper bound on how many rows can be returned to the 
> coordinator for coordinator-side filtering.
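
To make the split-row scenario from the test concrete, here is a minimal, self-contained sketch of why strict replica-side filtering loses the row. It is a toy model, not Cassandra's implementation: the class `SplitRowFilteringSketch`, the `Cell` type, and every method name are hypothetical, and "non-strict" is simplified to "replicas filter only on the indexed column, the coordinator filters on everything after reconciliation".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the split-row case; all names here are hypothetical.
class SplitRowFilteringSketch {
    // A column value together with its write timestamp.
    static final class Cell {
        final int value;
        final long timestamp;
        Cell(int value, long timestamp) { this.value = value; this.timestamp = timestamp; }
    }

    // Last-write-wins reconciliation of the row versions returned by replicas.
    static Map<String, Cell> reconcile(List<Map<String, Cell>> versions) {
        Map<String, Cell> merged = new HashMap<>();
        for (Map<String, Cell> version : versions)
            for (Map.Entry<String, Cell> e : version.entrySet())
                merged.merge(e.getKey(), e.getValue(),
                             (x, y) -> x.timestamp >= y.timestamp ? x : y);
        return merged;
    }

    // True only if the row has every predicate column with the expected value.
    static boolean matches(Map<String, Cell> row, Map<String, Integer> predicates) {
        for (Map.Entry<String, Integer> p : predicates.entrySet()) {
            Cell cell = row.get(p.getKey());
            if (cell == null || cell.value != p.getValue())
                return false;
        }
        return true;
    }

    // Replicas filter locally with replicaSide; the coordinator then
    // reconciles whatever was returned and applies all predicates.
    static boolean queryReturnsRow(List<Map<String, Cell>> replicas,
                                   Map<String, Integer> replicaSide,
                                   Map<String, Integer> all) {
        List<Map<String, Cell>> returned = new ArrayList<>();
        for (Map<String, Cell> local : replicas)
            if (matches(local, replicaSide))
                returned.add(local);
        return !returned.isEmpty() && matches(reconcile(returned), all);
    }

    // Row k = 0 after repair: both replicas have a = 1, but b and c
    // each landed on only one replica.
    static List<Map<String, Cell>> splitRow() {
        return List.of(Map.of("a", new Cell(1, 1), "b", new Cell(2, 2)),
                       Map.of("a", new Cell(1, 1), "c", new Cell(3, 3)));
    }

    static Map<String, Integer> allPredicates() { return Map.of("a", 1, "b", 2, "c", 3); }

    static Map<String, Integer> indexedOnly() { return Map.of("a", 1); }

    public static void main(String[] args) {
        // Strict: every predicate is applied on each replica.
        System.out.println("strict: " + queryReturnsRow(splitRow(), allPredicates(), allPredicates()));
        // Non-strict: only the indexed predicate is applied on each replica.
        System.out.println("non-strict: " + queryReturnsRow(splitRow(), indexedOnly(), allPredicates()));
    }
}
```

Under these assumptions the strict path drops the row, because neither replica alone holds both b = 2 and c = 3, while the coordinator-side path returns it. The number of rows shipped to the coordinator in the second case is still bounded by how many rows match the indexed clause.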



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

