[ https://issues.apache.org/jira/browse/CASSANDRA-20639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Caleb Rackliffe updated CASSANDRA-20639:
----------------------------------------
    Attachment: ci_summary.html
                result_details.tar.gz

> Replica filtering protection can trigger short-read protection too
> aggressively when the LIMIT is less than the number of results in a partition
> ------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-20639
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20639
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Consistency/Coordination, Feature/SAI
>            Reporter: Caleb Rackliffe
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 5.0.x, 5.x
>
>         Attachments: ci_summary.html, result_details.tar.gz
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> {{ReplicaFilteringProtection#queryProtectedPartitions()}} provides
> "completed" partitions to the {{DataResolver}} in two steps. First, it
> consumes the initial merged query results from the replicas, via a
> {{PartitionIterator}} which is short-read protected. As it does this, it
> consumes all matches in a partition. This forces the row data through RFP's
> merge listener, which catalogs the places where replicas are "silent" and
> marks them for completion. Second, {{PartitionBuilder}} uses this information
> to complete the partition with data from the replicas that provided ambiguous
> results.
> The problem here is in the first step. When the total number of matches in a
> large partition is a large multiple of the LIMIT, consuming all the matches
> in the partition triggers a flurry of short-read protection reads to any
> replicas that actually provided enough results to hit the limit. This problem
> is somewhat mitigated by CASSANDRA-20566 if we can use strict filtering and
> therefore {{SinglePartitionReadCommand}}, where digest matches bypass RFP
> altogether. (This would be especially likely with small limits and reasonably
> repaired data.)
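
To make the cost of the first step concrete, here is a toy model of it. It is not Cassandra code: {{SrpFlurryModel}}, {{ToyReplica}} and {{followUpReads()}} are hypothetical names invented for this sketch. It assumes a single replica that answers each read with at most LIMIT rows and is asked again whenever it returns a full page, which is a simplification of what short-read protection does.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: none of these names exist in Cassandra. */
public class SrpFlurryModel
{
    /** Hypothetical replica holding `matches` matching rows in one partition. */
    static final class ToyReplica
    {
        private final int matches;
        private int reads = 0;

        ToyReplica(int matches) { this.matches = matches; }

        /** Returns up to `limit` row ids starting at `offset`, and counts the request. */
        List<Integer> read(int offset, int limit)
        {
            reads++;
            List<Integer> rows = new ArrayList<>();
            for (int i = offset; i < Math.min(matches, offset + limit); i++)
                rows.add(i);
            return rows;
        }

        int reads() { return reads; }
    }

    /**
     * Counts the follow-up reads sent after the initial one.
     * With drainPartition = true the consumer keeps going until the partition
     * is exhausted, re-querying each time a page comes back full (assumed
     * stand-in for a short-read protection read). With drainPartition = false
     * it stops after the first page.
     */
    public static int followUpReads(int matches, int limit, boolean drainPartition)
    {
        if (limit <= 0)
            throw new IllegalArgumentException("limit must be positive");

        ToyReplica replica = new ToyReplica(matches);
        int consumed = 0;
        List<Integer> page = replica.read(0, limit);
        while (true)
        {
            consumed += page.size();
            boolean replicaHitLimit = page.size() == limit;
            if (!drainPartition || !replicaHitLimit)
                break;
            page = replica.read(consumed, limit);
        }
        return replica.reads() - 1; // exclude the initial read
    }

    public static void main(String[] args)
    {
        System.out.println(followUpReads(1000, 10, true));  // prints 100
        System.out.println(followUpReads(1000, 10, false)); // prints 0
    }
}
```

In this model the number of follow-up reads grows with matches / LIMIT when the whole partition is drained, and stays at zero when consumption stops at the limit.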
> Here's a short test that should hit all of this:
> (Put a breakpoint in the {{hasNext()}} inside {{queryProtectedPartitions()}},
> and another in {{ShortReadPartitionsProtection#executeReadCommand()}}, to see
> SRP reads being sent.)
> {noformat}
> @Test
> public void testShortReadNoSRP()
> {
>     CLUSTER.schemaChange(withKeyspace("CREATE TABLE %s.short_read_no_srp (k int, c int, a int, b int, PRIMARY KEY (k, c)) WITH read_repair = 'NONE'"));
>     CLUSTER.schemaChange(withKeyspace("CREATE INDEX ON %s.short_read_no_srp(a) USING 'sai'"));
>     CLUSTER.schemaChange(withKeyspace("CREATE INDEX ON %s.short_read_no_srp(b) USING 'sai'"));
>     SAIUtil.waitForIndexQueryable(CLUSTER, KEYSPACE);
>
>     // Write the row on node 1 only, so the other replicas are silent for it.
>     CLUSTER.get(1).executeInternal(withKeyspace("INSERT INTO %s.short_read_no_srp(k, c, a) VALUES (0, 2, 1) USING TIMESTAMP 5"));
>
>     // Query at ALL with a page size of 1.
>     String select = withKeyspace("SELECT * FROM %s.short_read_no_srp WHERE k = 0 AND a = 1");
>     Iterator<Object[]> initialRows = CLUSTER.coordinator(1).executeWithPaging(select, ConsistencyLevel.ALL, 1);
>     assertRows(initialRows, row(0, 2, 1, null));
> }
> {noformat}