[jira] [Updated] (CASSANDRA-20639) Replica filtering protection can trigger short-read protection too aggressively when the LIMIT is less than the number of results in a partition

Caleb Rackliffe (Jira) Tue, 13 May 2025 09:34:08 -0700


     [ 
https://issues.apache.org/jira/browse/CASSANDRA-20639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Caleb Rackliffe updated CASSANDRA-20639:
----------------------------------------
    Attachment: ci_summary.html
                result_details.tar.gz

> Replica filtering protection can trigger short-read protection too 
> aggressively when the LIMIT is less than the number of results in a partition
> ------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-20639
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20639
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Consistency/Coordination, Feature/SAI
>            Reporter: Caleb Rackliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 5.0.x, 5.x
>
>         Attachments: ci_summary.html, result_details.tar.gz
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> {{ReplicaFilteringProtection#queryProtectedPartitions()}} provides 
> "completed" partitions to the {{DataResolver}} in two steps. First, it 
> consumes the initial merged query results from the replicas, via a 
> {{PartitionIterator}} which is short-read protected. As it does this, it 
> consumes all matches in a partition. This forces the row data through RFP's 
> merge listener, which catalogs the places where replicas are "silent" and 
> marks them for completion. Second, {{PartitionBuilder}} uses this information to 
> complete the partition with data from the replicas that provided ambiguous 
> results.
> The problem here is in the first step. When the total number of matches in a 
> large partition is a large multiple of the LIMIT, consuming all the matches 
> in the partition triggers a flurry of short-read protection reads to any 
> replicas that actually provided enough results to hit the limit. This problem 
> is somewhat mitigated by CASSANDRA-20566 if we can use strict filtering and 
> therefore {{SinglePartitionReadCommand}}, where digest matches bypass RFP 
> altogether. (This would be especially likely with small limits and reasonably 
> repaired data.)
> Here's a short test that should hit all of this:
> (Just put a breakpoint in {{hasNext()}} inside {{queryProtectedPartitions()}}, 
> and then another in {{ShortReadPartitionsProtection#executeReadCommand()}}, to 
> see SRP reads being sent.)
> {noformat}
> @Test
> public void testShortReadNoSRP()
> {
>     CLUSTER.schemaChange(withKeyspace("CREATE TABLE %s.short_read_no_srp (k int, c int, a int, b int, PRIMARY KEY (k, c)) WITH read_repair = 'NONE'"));
>     CLUSTER.schemaChange(withKeyspace("CREATE INDEX ON %s.short_read_no_srp(a) USING 'sai'"));
>     CLUSTER.schemaChange(withKeyspace("CREATE INDEX ON %s.short_read_no_srp(b) USING 'sai'"));
>     SAIUtil.waitForIndexQueryable(CLUSTER, KEYSPACE);
>     CLUSTER.get(1).executeInternal(withKeyspace("INSERT INTO %s.short_read_no_srp(k, c, a) VALUES (0, 2, 1) USING TIMESTAMP 5"));
>     String select = withKeyspace("SELECT * FROM %s.short_read_no_srp WHERE k = 0 AND a = 1");
>     Iterator<Object[]> initialRows = CLUSTER.coordinator(1).executeWithPaging(select, ConsistencyLevel.ALL, 1);
>     assertRows(initialRows, row(0, 2, 1, null));
> }
> {noformat}
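
As a rough illustration of the first-step problem described above, the toy model below counts how many follow-up reads a consumer would issue if it drained every match in a partition while the replica only ever returns LIMIT rows at a time. This is not Cassandra code: the names SrpFlurrySketch, readFromReplica and followUpReadsWhenDraining are hypothetical, and the model assumes each follow-up read fetches at most LIMIT rows, which simplifies how the real short-read protection sizes its requests.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; none of these names exist in Cassandra.
class SrpFlurrySketch
{
    /** Hypothetical replica read: returns up to `limit` row ids starting at `offset`. */
    static List<Integer> readFromReplica(int matches, int offset, int limit)
    {
        List<Integer> rows = new ArrayList<>();
        for (int i = offset; i < Math.min(matches, offset + limit); i++)
            rows.add(i);
        return rows;
    }

    /**
     * Drains a partition holding `matches` matching rows and counts the
     * follow-up reads issued after the initial limited read.
     */
    static int followUpReadsWhenDraining(int matches, int limit)
    {
        if (limit <= 0)
            throw new IllegalArgumentException("limit must be positive");

        List<Integer> batch = readFromReplica(matches, 0, limit);
        int consumed = batch.size();
        int followUps = 0;

        // A batch that hit the limit may be hiding more rows, so a consumer
        // that insists on seeing the whole partition has to ask again.
        while (batch.size() == limit)
        {
            batch = readFromReplica(matches, consumed, limit);
            followUps++;
            consumed += batch.size();
        }
        return followUps;
    }

    public static void main(String[] args)
    {
        System.out.println(followUpReadsWhenDraining(1000, 10)); // many matches, small limit
        System.out.println(followUpReadsWhenDraining(1, 1));     // one match, limit of one
        System.out.println(followUpReadsWhenDraining(5, 10));    // fewer matches than the limit
    }
}
```

Under these assumptions the three calls print 100, 1 and 0: the number of extra reads grows with the ratio of matches to the limit, and no extra read happens when the first response comes back short of the limit.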



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

