As I understand from the link below, promoting the column index into the sstable index file will not only let reads skip sstables entirely but also reduce disk seeks from 3 to 2 for wide rows.
Our index files are always mmapped, so there is only one random seek for a named column query. I think that is a wonderful improvement.

Shouldn't we be wary of the spike in heap usage from promoting column indexes to the index file? It would be nice to write, say, every 128th entry out to disk while loading only every 512th entry into memory during start-up, just as a balancing factor.

--
Ravi

On Tue, Sep 18, 2012 at 4:47 PM, Sylvain Lebresne <sylv...@datastax.com> wrote:

> > Range queries do not use bloom filters. It holds good for
> > composite-columns also right?
>
> Since I assume you are referring to column's bloom filters (key's bloom
> filters are always used) then yes, that holds good for composite columns.
> Currently, composite column names are completely opaque to the storage
> engine.
>
> > <Column-part-1> alone could have gone into the bloom-filter, speeding up
> > my queries really effectively
>
> True, though https://issues.apache.org/jira/browse/CASSANDRA-2319 (in 1.2
> only however) should help quite a lot here. Basically it will allow
> skipping the sstable based on the column index. Granted, this is less
> fine-grained than a bloom filter (though on the other side there are no
> false positives), but I suspect that in most real-life workloads it won't
> be too much worse.
>
> --
> Sylvain
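
To make the 128/512 suggestion in the reply concrete, here is a minimal Python sketch of two-level index sampling. It is only one reading of the proposal, not how Cassandra implements its index: the names `DISK_INTERVAL`, `MEMORY_INTERVAL`, `sample`, and `lookup_start` are hypothetical, and plain in-memory lists stand in for the index file and the heap-resident sample.

```python
from bisect import bisect_right

# Hypothetical intervals taken from the suggestion in the reply.
DISK_INTERVAL = 128    # every 128th entry is kept in the index file
MEMORY_INTERVAL = 512  # every 512th entry is loaded on the heap at start-up


def sample(entries, interval):
    """Return every `interval`-th (key, offset) pair, starting with the first."""
    return entries[::interval]


def lookup_start(memory_sample, disk_sample, key):
    """Return the offset of the last sampled entry whose key is <= `key`.

    Returns None if `key` sorts before the first entry.
    Step 1 narrows the search with the in-memory sample (no I/O).
    Step 2 refines it inside the matching slice of the on-disk sample,
    which in a real system would cost one read of the index file.
    """
    mem_keys = [k for k, _ in memory_sample]
    i = bisect_right(mem_keys, key) - 1
    if i < 0:
        return None
    # Each in-memory entry covers this many consecutive on-disk entries.
    per_block = MEMORY_INTERVAL // DISK_INTERVAL
    block = disk_sample[i * per_block:(i + 1) * per_block]
    disk_keys = [k for k, _ in block]
    j = bisect_right(disk_keys, key) - 1
    return block[j][1]


# Illustrative data: 2000 sorted column names with made-up offsets.
entries = [(f"col{n:06d}", n * 10) for n in range(2000)]
disk_sample = sample(entries, DISK_INTERVAL)      # 16 entries
memory_sample = sample(entries, MEMORY_INTERVAL)  # 4 entries
print(lookup_start(memory_sample, disk_sample, "col000700"))  # -> 6400
```

With these intervals the heap holds a quarter of what the index file holds, at the cost of one extra step through the on-disk sample per lookup. That is the balance the reply is asking about.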