You should check the 3.x release. CASSANDRA-10657 could have fixed your problem.

On Thu, Mar 22, 2018 at 9:15 PM, Benjamin Lerer <benjamin.le...@datastax.com> wrote:

> Sylvain explained the problem in CASSANDRA-4536:
> "Let me note that in CQL3 a row that have no live column don't exist, so
> we can't really implement this with a range slice having an empty columns
> list. Instead we should do a range slice with a full-row slice predicate
> with a count of 1, to make sure we do have a live column before including
> the partition key."
>
> By using ColumnFilter.selectionBuilder(), you do not select all the
> columns. As a consequence, some partitions might be returned when they
> should not be.
>
> On Thu, Mar 22, 2018 at 6:24 PM, Sam Klock <skl...@akamai.com> wrote:
>
>> Cassandra devs,
>>
>> We use workflows in some of our clusters (running 3.0.15) that involve
>> "SELECT DISTINCT key FROM..."-style queries. For some tables, we
>> observed extremely poor performance under light load (i.e., a small
>> number of rows per second and frequent timeouts), which we eventually
>> traced to replicas shipping entire rows (which in some cases could store
>> on the order of MBs of data) to service the query. That surprised us
>> (partly because 2.1 doesn't seem to behave this way), so we did some
>> digging, and we eventually came up with a patch that modifies
>> SelectStatement.java in the following way: if the selection in the query
>> only includes the partition key, then when building a ColumnFilter for
>> the query, use:
>>
>> builder = ColumnFilter.selectionBuilder();
>>
>> instead of:
>>
>> builder = ColumnFilter.allColumnsBuilder();
>>
>> to initialize the ColumnFilter.Builder in gatherQueriedColumns(). That
>> seems to repair the performance regression, and it doesn't appear to
>> break any functionality (based on the unit tests and some smoke tests we
>> ran involving insertions and deletions).
>>
>> We'd like to contribute this patch back to the project, but we're not
>> convinced that there aren't subtle correctness issues we're missing,
>> judging both from comments in the code and the existence of
>> CASSANDRA-5912, which suggests optimizing this kind of query is
>> nontrivial.
>>
>> So: does this change sound safe to make, or are there corner cases we
>> need to account for? If there are corner cases, are there plausibly
>> ways of addressing them at the SelectStatement level, or will we need to
>> look deeper?
>>
>> Thanks,
>> SK
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: dev-h...@cassandra.apache.org
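
To make the corner case from CASSANDRA-4536 concrete, here is a minimal Java sketch. It is a toy model only and does not use any Cassandra classes: Cell, Partition, fetch(), and both distinctKeys* methods are hypothetical names invented for illustration, and "live" is reduced to a single boolean. The point it tries to show is that a read which fetches no columns has nothing to check liveness against, so it may report keys of partitions whose data is entirely deleted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class DistinctKeySketch {
    // Toy cell: a column name plus a flag saying whether the value is live
    // (that is, not deleted or expired). Hypothetical, not a Cassandra type.
    static final class Cell {
        final String column;
        final boolean live;
        Cell(String column, boolean live) { this.column = column; this.live = live; }
    }

    // Toy partition: a partition key and the cells stored under it.
    static final class Partition {
        final String key;
        final List<Cell> cells;
        Partition(String key, List<Cell> cells) { this.key = key; this.cells = cells; }
    }

    // Models a read restricted to the given columns; null means "all columns".
    static List<Cell> fetch(Partition p, Set<String> columns) {
        List<Cell> out = new ArrayList<>();
        for (Cell c : p.cells) {
            if (columns == null || columns.contains(c.column)) {
                out.add(c);
            }
        }
        return out;
    }

    // Fetches every column and keeps a key only if at least one cell is live.
    static List<String> distinctKeysChecked(List<Partition> partitions) {
        List<String> keys = new ArrayList<>();
        for (Partition p : partitions) {
            boolean anyLive = false;
            for (Cell c : fetch(p, null)) {
                anyLive |= c.live;
            }
            if (anyLive) {
                keys.add(p.key);
            }
        }
        return keys;
    }

    // Fetches no columns at all: cheap, but there is nothing to test for
    // liveness, so every stored partition key is reported.
    static List<String> distinctKeysUnchecked(List<Partition> partitions) {
        List<String> keys = new ArrayList<>();
        for (Partition p : partitions) {
            fetch(p, Set.of()); // returns an empty list; liveness is unknown
            keys.add(p.key);
        }
        return keys;
    }

    // Three partitions: "a" is live, "b" is fully deleted, "c" is partly deleted.
    static List<Partition> sampleData() {
        return List.of(
            new Partition("a", List.of(new Cell("v", true))),
            new Partition("b", List.of(new Cell("v", false))),
            new Partition("c", List.of(new Cell("v", false), new Cell("w", true))));
    }

    public static void main(String[] args) {
        System.out.println("checked:   " + distinctKeysChecked(sampleData()));
        System.out.println("unchecked: " + distinctKeysUnchecked(sampleData()));
    }
}
```

In this model the checked variant skips "b" while the unchecked one returns it. Whether the real read path behaves this way for a given table depends on details the sketch ignores, so treat it as an aid to reading the thread rather than a description of Cassandra internals.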