> Actually, the points you make are things I have overlooked and actually
> make me feel more comfortable about how cassandra will perform for my
> use cases. I'm interested, in my case, to find out what the bloom filter
> false-positive rate is. Hopefully, a stat is kept on this.

Assuming a lack of implementation bugs and a good enough hash algorithm,
the false positive rate of a bloom filter is mathematically determined.
See:

   http://pages.cs.wisc.edu/~cao/papers/summary-cache/node8.html

And in cassandra:

   java/org/apache/cassandra/utils/BloomCalculations.java
   java/org/apache/cassandra/utils/BloomFilter.java

(I don't know offhand, without checking, whether the observed false
positive rate is actually tracked.)
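To make that concrete: per the paper above, a filter of m bits holding n
keys with k hash functions has an expected false positive rate of roughly
(1 - e^(-kn/m))^k. A standalone sketch of that math (not Cassandra's
code; the parameter names are mine):

    // Expected false positive rate for an m-bit filter holding n keys
    // with k hash functions, per the summary-cache paper:
    //   p = (1 - e^(-kn/m))^k
    public final class BloomMath {
        static double falsePositiveRate(long m, long n, int k) {
            return Math.pow(1 - Math.exp(-(double) k * n / m), k);
        }

        // The k that minimizes p for given m and n is (m/n) * ln 2.
        static int optimalHashCount(long m, long n) {
            return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
        }

        public static void main(String[] args) {
            long n = 1000000;   // keys in the filter
            long m = 15 * n;    // 15 bits per key
            int k = optimalHashCount(m, n);
            // Prints roughly: k=10, p=0.000744
            System.out.printf("k=%d, p=%.6f%n", k, falsePositiveRate(m, n, k));
        }
    }

So at, say, 15 bits per key, you'd expect well under a 0.1% false
positive rate regardless of how many keys you have.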
> As long as
> ALL of the bloom filters are in memory, the hit should be minimal for a

Bloom filters are by design in memory at all times. They are the worst
possible case you can imagine in terms of random access, so it would
never make sense to keep them on disk, even partially. (This assumes the
JVM isn't being otherwise swapped out, which is another issue.)

> Good point on the row cache. I had actually misread the comments in the
> yaml, mistaking "do not use on ColumnFamilies with LARGE ROWS" as "do
> not use on ColumnFamilies with a LARGE NUMBER OF ROWS". I don't know if
> this will improve performance much since I don't understand yet if this
> eliminates the need to check for the data in the SStables. If it
> doesn't, then what is the point of the row cache since the data is also
> in an in-memory memtable?

It does eliminate the need to go down to the sstables. It also survives
compactions (so it doesn't go cold when sstables are replaced).

Reasons not to use the row cache with large rows include:

* In general it's a waste of memory that is better given to the OS page
  cache, unless perhaps you're continually reading entire rows rather
  than subsets of rows.

* For truly large rows you may have immediate issues with the sheer size
  of the data being cached; e.g., attempting to cache a 2 GB row is not
  the best idea in terms of heap space consumption; you'll likely OOM or
  trigger fallbacks to full GC, etc.

* Having a larger key cache may often be more productive.

> That aside, splitting the memtable in 2 could make checking the bloom
> filters unnecessary in most cases for me, but I'm not sure it's worth
> the effort.

Offhand, write-through row caching seems like a more direct approach to
me personally; see the sketch below. Also, to the extent that you're
worried about false positive rates, larger bloom filters may still be an
option (not currently configurable; that would require source changes).
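To illustrate what I mean by write-through (as opposed to invalidating
cached rows on write), here is a toy sketch; this is not Cassandra's
actual row cache, and the names are made up:

    import java.util.HashMap;
    import java.util.Map;

    // Toy write-through row cache: a write updates the cached row in
    // place instead of invalidating it, so subsequent reads never have
    // to fall through to memtables/sstables (and thus never touch the
    // bloom filters). Concurrency is ignored for brevity.
    final class WriteThroughRowCache {
        private final Map<String, Map<String, byte[]>> cache =
            new HashMap<String, Map<String, byte[]>>();

        void onWrite(String rowKey, String column, byte[] value) {
            Map<String, byte[]> row = cache.get(rowKey);
            if (row != null) {
                row.put(column, value); // keep the cached copy current
            }
            // ... then apply the write to the memtable/commitlog as usual
        }

        Map<String, byte[]> read(String rowKey) {
            return cache.get(rowKey); // a hit skips bloom filters entirely
        }
    }

--
/ Peter Schuller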