Optimizing queries for partition keys

Sam Klock Thu, 22 Mar 2018 10:25:11 -0700

Cassandra devs,

We use workflows in some of our clusters (running 3.0.15) that involve
"SELECT DISTINCT key FROM..."-style queries.  For some tables, we
observed extremely poor performance under light load (i.e., a small
number of rows per second and frequent timeouts), which we eventually
traced to replicas shipping entire rows (which in some cases could store
on the order of MBs of data) to service the query.  That surprised us
(partly because 2.1 doesn't seem to behave this way), so we did some
digging, and we eventually came up with a patch that modifies
SelectStatement.java in the following way: if the selection in the query
only includes the partition key, then when building a ColumnFilter for
the query, use:


    builder = ColumnFilter.selectionBuilder();

instead of:

    builder = ColumnFilter.allColumnsBuilder();

to initialize the ColumnFilter.Builder in gatherQueriedColumns().  That
seems to repair the performance regression, and it doesn't appear to
break any functionality (based on the unit tests and some smoke tests we
ran involving insertions and deletions).
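
To make the intent of the change concrete, here is a minimal,
self-contained sketch of the decision it introduces.  It is a toy model
only: ToyFilter, buildFilter(), and the column names are hypothetical
stand-ins, not Cassandra classes or the actual patch, and it does not
model deletions or any other corner case.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these types are Cassandra classes.
public class ColumnFilterSketch {

    // Hypothetical stand-in for a column filter.
    static final class ToyFilter {
        final boolean fetchAll;     // true models the allColumnsBuilder() behavior
        final Set<String> queried;  // regular columns named in the SELECT

        ToyFilter(boolean fetchAll, Set<String> queried) {
            this.fetchAll = fetchAll;
            this.queried = queried;
        }

        // Regular columns a replica would read and ship for this filter.
        Set<String> fetched(Set<String> regularColumns) {
            if (fetchAll) {
                return regularColumns;
            }
            Set<String> out = new LinkedHashSet<>(queried);
            out.retainAll(regularColumns);
            return out;
        }
    }

    // Assumed rule: with the patch, a selection that names only partition
    // key columns no longer forces every regular column to be fetched.
    static ToyFilter buildFilter(List<String> selected,
                                 Set<String> partitionKey,
                                 boolean patched) {
        boolean onlyPartitionKey = partitionKey.containsAll(selected);
        Set<String> queried = new LinkedHashSet<>(selected);
        queried.removeAll(partitionKey); // key columns are not regular columns
        boolean fetchAll = !(patched && onlyPartitionKey);
        return new ToyFilter(fetchAll, queried);
    }

    public static void main(String[] args) {
        Set<String> pk = new LinkedHashSet<>(Arrays.asList("key"));
        Set<String> regular =
            new LinkedHashSet<>(Arrays.asList("payload", "updated_at"));
        List<String> distinctKeys = Arrays.asList("key");

        System.out.println("unpatched: "
            + buildFilter(distinctKeys, pk, false).fetched(regular));
        System.out.println("patched:   "
            + buildFilter(distinctKeys, pk, true).fetched(regular));
    }
}
```

In this model a key-only selection fetches no regular columns once the
patch is applied, while any selection that names a regular column still
fetches everything, as before.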

We'd like to contribute this patch back to the project, but we're not
convinced that there aren't subtle correctness issues we're missing,
judging both from comments in the code and the existence of
CASSANDRA-5912, which suggests optimizing this kind of query is nontrivial.

So: does this change sound safe to make, or are there corner cases we
need to account for?  If there are corner cases, are there plausible
ways of addressing them at the SelectStatement level, or will we need to
look deeper?

Thanks,
SK
