Cassandra devs,

We use workflows in some of our clusters (running 3.0.15) that involve "SELECT DISTINCT key FROM..."-style queries. For some tables, we observed extremely poor performance under light load (i.e., a small number of rows per second and frequent timeouts), which we eventually traced to replicas shipping entire rows (which in some cases could store on the order of MBs of data) to service the query.

That surprised us (partly because 2.1 doesn't seem to behave this way), so we did some digging, and we eventually came up with a patch that modifies SelectStatement.java in the following way: if the selection in the query only includes the partition key, then when building a ColumnFilter for the query, use:

    builder = ColumnFilter.selectionBuilder();

instead of:

    builder = ColumnFilter.allColumnsBuilder();

to initialize the ColumnFilter.Builder in gatherQueriedColumns().

That seems to repair the performance regression, and it doesn't appear to break any functionality (based on the unit tests and some smoke tests we ran involving insertions and deletions).

We'd like to contribute this patch back to the project, but we're not convinced that there aren't subtle correctness issues we're missing, judging both from comments in the code and the existence of CASSANDRA-5912, which suggests optimizing this kind of query is nontrivial.

So: does this change sound safe to make, or are there corner cases we need to account for? If there are corner cases, are there plausibly ways of addressing them at the SelectStatement level, or will we need to look deeper?

Thanks,
SK