Caleb Rackliffe created CASSANDRA-20639:
-------------------------------------------
Summary: Replica filtering protection can trigger short-read protection too aggressively when the LIMIT is less than the number of results in a partition
Key: CASSANDRA-20639
URL: https://issues.apache.org/jira/browse/CASSANDRA-20639
Project: Apache Cassandra
Issue Type: Improvement
Components: Consistency/Coordination, Feature/SAI
Reporter: Caleb Rackliffe
Assignee: Caleb Rackliffe

{{ReplicaFilteringProtection#queryProtectedPartitions()}} provides "completed" partitions to the {{DataResolver}} in two steps. First, it consumes the initial merged query results from the replicas via a {{PartitionIterator}} that is short-read protected. As it does this, it consumes all matches in a partition. This forces the row data through RFP's merge listener, which catalogs the places where replicas are "silent" and marks them for completion. Second, {{PartitionBuilder}} uses this information to complete the partition with data from the replicas that provided ambiguous results.

The problem here is in the first step. When the total number of matches in a large partition is a large multiple of the LIMIT, consuming all the matches in the partition triggers a flurry of short-read protection (SRP) reads to any replicas that actually provided enough results to hit the limit.

This problem is somewhat mitigated by CASSANDRA-20566 if we can use strict filtering and therefore {{SinglePartitionReadCommand}}, where digest matches bypass RFP altogether. (This would be especially likely with small limits and reasonably repaired data.)

Here's a short test that should hit all of this. Put a breakpoint in {{hasNext()}} inside {{queryProtectedPartitions()}}, and then in {{ShortReadPartitionsProtection#executeReadCommand()}}, to see SRP reads being sent:
{noformat}
@Test
public void testShortReadNoSRP()
{
    CLUSTER.schemaChange(withKeyspace("CREATE TABLE %s.short_read_no_srp (k int, c int, a int, b int, PRIMARY KEY (k, c)) WITH read_repair = 'NONE'"));
    CLUSTER.schemaChange(withKeyspace("CREATE INDEX ON %s.short_read_no_srp(a) USING 'sai'"));
    CLUSTER.schemaChange(withKeyspace("CREATE INDEX ON %s.short_read_no_srp(b) USING 'sai'"));
    SAIUtil.waitForIndexQueryable(CLUSTER, KEYSPACE);

    CLUSTER.get(1).executeInternal(withKeyspace("INSERT INTO %s.short_read_no_srp(k, c, a) VALUES (0, 2, 1) USING TIMESTAMP 5"));

    String select = withKeyspace("SELECT * FROM %s.short_read_no_srp WHERE k = 0 AND a = 1");
    Iterator<Object[]> initialRows = CLUSTER.coordinator(1).executeWithPaging(select, ConsistencyLevel.ALL, 1);
    assertRows(initialRows, row(0, 2, 1, null));
}
{noformat}
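
For intuition about the "flurry" in the first step, here is a toy model of the read pattern. It is not Cassandra code: the class {{SrpFlurrySketch}}, the method {{followUpReads()}}, and the {{rowsPerFollowUp}} parameter are all hypothetical, and the fixed follow-up fetch size is an assumption made only to keep the arithmetic simple. The sketch counts how many extra reads go to one replica that filled the LIMIT on its initial response, comparing a consumer that stops at the LIMIT with one that drains every match in the partition.

```java
// Toy model only. This is NOT how Cassandra implements short-read protection;
// all names and the fixed follow-up fetch size are illustrative assumptions.
class SrpFlurrySketch
{
    /**
     * Counts follow-up reads sent to a single replica.
     *
     * @param matchesOnReplica rows in the partition that match the query on this replica
     * @param limit            the query LIMIT, also the size of the initial replica response
     * @param rowsPerFollowUp  rows requested by each follow-up read (assumed constant)
     * @param drainPartition   true if the consumer reads every match, false if it stops at the LIMIT
     */
    static int followUpReads(int matchesOnReplica, int limit, int rowsPerFollowUp, boolean drainPartition)
    {
        if (matchesOnReplica < 0 || limit <= 0 || rowsPerFollowUp <= 0)
            throw new IllegalArgumentException("matchesOnReplica must be >= 0; limit and rowsPerFollowUp must be > 0");

        int returned = Math.min(limit, matchesOnReplica);
        boolean replicaHitLimit = returned == limit;

        // A replica that returned fewer rows than the LIMIT is known to be exhausted,
        // and a consumer that stops at the LIMIT never asks for more rows.
        if (!drainPartition || !replicaHitLimit)
            return 0;

        int reads = 0;
        boolean lastReadWasFull = true;
        // Keep asking until a follow-up read comes back short, which signals exhaustion.
        while (lastReadWasFull)
        {
            int fetched = Math.min(rowsPerFollowUp, matchesOnReplica - returned);
            reads++;
            returned += fetched;
            lastReadWasFull = fetched == rowsPerFollowUp;
        }
        return reads;
    }

    public static void main(String[] args)
    {
        // 100 matches, LIMIT 10, 10 rows per follow-up read
        System.out.println(followUpReads(100, 10, 10, false)); // prints 0
        System.out.println(followUpReads(100, 10, 10, true));  // prints 10
    }
}
```

In this model the extra reads grow roughly with the ratio of partition matches to the LIMIT, which is the shape of the problem described above. The real number of reads depends on how SRP sizes its follow-up requests.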