Secondary indexes are there for convenience, not performance. If you're looking for something performant, you'll need to maintain your own indexes.
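The usual pattern is a denormalized lookup table that you write alongside the base table. A minimal sketch of that pattern using the DataStax Java driver, with a hypothetical users/users_by_email schema (table and keyspace names are made up for illustration):

    import com.datastax.driver.core.BatchStatement;
    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.Session;
    import java.util.UUID;

    public class ManualIndexExample {
        public static void main(String[] args) {
            // Hypothetical schema, assumed to exist already:
            //   CREATE TABLE users (user_id uuid PRIMARY KEY, email text, name text);
            //   CREATE TABLE users_by_email (email text PRIMARY KEY, user_id uuid);
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect("my_keyspace");

            PreparedStatement insertUser = session.prepare(
                    "INSERT INTO users (user_id, email, name) VALUES (?, ?, ?)");
            PreparedStatement insertByEmail = session.prepare(
                    "INSERT INTO users_by_email (email, user_id) VALUES (?, ?)");

            UUID userId = UUID.randomUUID();
            String email = "sam@example.com";

            // A logged batch keeps the base table and the hand-maintained
            // "index" table eventually consistent with each other.
            BatchStatement batch = new BatchStatement();
            batch.add(insertUser.bind(userId, email, "Sam"));
            batch.add(insertByEmail.bind(email, userId));
            session.execute(batch);

            // Lookups by email now hit a single partition directly instead
            // of going through a secondary index scan.
            UUID found = session.execute(
                    "SELECT user_id FROM users_by_email WHERE email = ?", email)
                    .one().getUUID("user_id");
            System.out.println("found user " + found);

            cluster.close();
        }
    }

The trade-off is that you own the consistency of the two tables yourself: every write path (and delete path) has to update both.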
On Mon Dec 29 2014 at 3:22:58 PM Sam Klock <skl...@akamai.com> wrote:

> Hi folks,
>
> Perhaps this is a question better addressed to the Cassandra developers
> directly, but I thought I'd ask it here first. We've recently been
> benchmarking certain uses of secondary indexes in Cassandra 2.1.x, and
> we've noticed that when the number of items in an index grows beyond
> some threshold (perhaps several tens of thousands, depending on the
> cardinality), performance begins to degrade substantially. This is
> particularly the case when the client does things it probably shouldn't
> do (like manually paginate results), but we suspect there's at least
> one issue in Cassandra having an impact here that we'd like to
> understand better.
>
> Our investigation led us to the logic Cassandra uses to paginate scans
> of rows in indexes on composites. The issue seems to be the short
> algorithm Cassandra uses to select the size of the pages for the scan,
> partially given on the following two lines (from
> o.a.c.db.index.composites.CompositesSearcher):
>
>     private int meanColumns = Math.max(index.getIndexCfs().getMeanColumns(), 1);
>     private int rowsPerQuery = Math.max(Math.min(filter.maxRows(),
>                                                  filter.maxColumns() / meanColumns), 2);
>
> The value computed for rowsPerQuery appears to be the page size.
>
> Based on our reading of the code, unless the value obtained for
> meanColumns is very small, a large query-level page size is used, or
> the DISTINCT keyword is used, the value for (filter.maxColumns() /
> meanColumns) always ends up being small enough that the page size is
> 2. This seems to be the case both for very low-cardinality indexes
> (two distinct indexed values) and for indexes with higher
> cardinalities, as long as the number of entries per index row is more
> than a few thousand.
>
> The fact that we consistently get such a small page size appears to
> have a substantial impact on performance. The overhead is simply
> devastating, especially since it looks like the pages overlap with
> each other (the last element of one page is the first element of the
> next). To wit: if we fix the index page size in code to a very large
> number, index queries in our environment that previously required over
> two minutes to complete can finish in under ten seconds.
>
> Some (but probably not this much) overhead might be acceptable if the
> algorithm is intended to achieve other worthy goals (safety?). But
> what's puzzling to us is that we can't figure out what it's intended
> to do. We suspect the algorithm is simply buggy, but we'd like insight
> from knowledgeable parties before we draw that conclusion and try to
> find a different solution.
>
> Does anyone here have relevant experience with secondary indexes that
> might shed light on the design choice here? In particular, can anyone
> (perhaps the developers?) explain what this algorithm is intended to
> do, and what we might do to safely get around this limitation?
>
> Also (to the developers watching this list): is this the sort of
> question we should be addressing to the dev list directly?
>
> Thanks,
> SK
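For anyone following along, the collapse to a page size of 2 is easy to reproduce from the two quoted lines alone. A standalone sketch of that arithmetic; the stats below are illustrative numbers, not measurements from any real table:

    public class IndexPageSizeDemo {
        // Mirrors the arithmetic quoted above from CompositesSearcher.
        static int rowsPerQuery(int maxRows, int maxColumns, int meanColumnsRaw) {
            int meanColumns = Math.max(meanColumnsRaw, 1);
            return Math.max(Math.min(maxRows, maxColumns / meanColumns), 2);
        }

        public static void main(String[] args) {
            // Illustrative values: a query-level page of 5000 cells,
            // effectively unbounded rows, and index rows of varying size.
            int maxRows = Integer.MAX_VALUE;
            int maxColumns = 5000;

            for (int meanColumns : new int[] {10, 1000, 50_000, 500_000}) {
                System.out.printf("meanColumns=%,d -> page size %d%n",
                        meanColumns, rowsPerQuery(maxRows, maxColumns, meanColumns));
            }
            // Prints 500, 5, 2, 2: once meanColumns exceeds maxColumns / 2,
            // the integer division yields 0 or 1, the Math.max(..., 2) floor
            // always wins, and the scan proceeds two index rows at a time.
        }
    }

So for any index whose rows have grown large relative to the query page size (exactly the low-cardinality case Sam describes), the computed page size bottoms out at the hard-coded floor of 2, regardless of how many entries actually need to be scanned.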