> Actually, the points you make are things I have overlooked and actually
> make me feel more comfortable about how cassandra will perform for my
> use cases. I'm interested, in my case, to find out what the bloom filter
> false-positive rate is. Hopefully, a stat is kept on this.

Assuming a lack of implementation bugs and a good enough hash algorithm,
the false positive rate of a bloom filter is mathematically determined.
See:

   http://pages.cs.wisc.edu/~cao/papers/summary-cache/node8.html

And in cassandra:

   java/org/apache/cassandra/utils/BloomCalculations.java
   java/org/apache/cassandra/utils/BloomFilter.java

(I don't know offhand, without checking, whether the observed false
positive rate is actually tracked.)
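To make that concrete: per the paper above, a filter of m bits holding n
keys with k hash functions has an expected false positive rate of roughly
(1 - e^(-kn/m))^k. A standalone sketch of that math (not Cassandra's
code; the parameter names are mine):

    // Expected false positive rate for an m-bit filter holding n keys
    // with k hash functions, per the summary-cache paper:
    //   p = (1 - e^(-kn/m))^k
    public final class BloomMath {
        static double falsePositiveRate(long m, long n, int k) {
            return Math.pow(1 - Math.exp(-(double) k * n / m), k);
        }

        // The k that minimizes p for given m and n is (m/n) * ln 2.
        static int optimalHashCount(long m, long n) {
            return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
        }

        public static void main(String[] args) {
            long n = 1000000;   // keys in the filter
            long m = 15 * n;    // 15 bits per key
            int k = optimalHashCount(m, n);
            // Prints roughly: k=10, p=0.000744
            System.out.printf("k=%d, p=%.6f%n", k, falsePositiveRate(m, n, k));
        }
    }

So at, say, 15 bits per key, you'd expect well under a 0.1% false
positive rate regardless of how many keys you have.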
> As long as
> ALL of the bloom filters are in memory, the hit should be minimal for a

Bloom filters are by design in memory at all times. They are the worst
possible case you can imagine in terms of random access, so it would
never make sense to keep them on disk, even partially. (This assumes the
JVM isn't being otherwise swapped out, which is another issue.)

> Good point on the row cache. I had actually misread the comments in the
> yaml, mistaking "do not use on ColumnFamilies with LARGE ROWS" as "do
> not use on ColumnFamilies with a LARGE NUMBER OF ROWS". I don't know if
> this will improve performance much since I don't understand yet if this
> eliminates the need to check for the data in the SStables. If it
> doesn't, then what is the point of the row cache since the data is also
> in an in-memory memtable?

It does eliminate the need to go down to the sstables. It also survives
compactions (so it doesn't go cold when sstables are replaced).

Reasons not to use the row cache with large rows include:

* In general it's a waste of memory that is better given to the OS page
  cache, unless perhaps you're continually reading entire rows rather
  than subsets of rows.

* For truly large rows you may have immediate issues with the sheer size
  of the data being cached; e.g., attempting to cache a 2 GB row is not
  the best idea in terms of heap space consumption; you'll likely OOM or
  trigger fallbacks to full GC, etc.

* Having a larger key cache may often be more productive.

> That aside, splitting the memtable in 2 could make checking the bloom
> filters unnecessary in most cases for me, but I'm not sure it's worth
> the effort.

Offhand, write-through row caching seems like a more direct approach to
me personally; see the sketch below. Also, to the extent that you're
worried about false positive rates, larger bloom filters may still be an
option (not currently configurable; that would require source changes).
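To illustrate what I mean by write-through (as opposed to invalidating
cached rows on write), here is a toy sketch; this is not Cassandra's
actual row cache, and the names are made up:

    import java.util.HashMap;
    import java.util.Map;

    // Toy write-through row cache: a write updates the cached row in
    // place instead of invalidating it, so subsequent reads never have
    // to fall through to memtables/sstables (and thus never touch the
    // bloom filters). Concurrency is ignored for brevity.
    final class WriteThroughRowCache {
        private final Map<String, Map<String, byte[]>> cache =
            new HashMap<String, Map<String, byte[]>>();

        void onWrite(String rowKey, String column, byte[] value) {
            Map<String, byte[]> row = cache.get(rowKey);
            if (row != null) {
                row.put(column, value); // keep the cached copy current
            }
            // ... then apply the write to the memtable/commitlog as usual
        }

        Map<String, byte[]> read(String rowKey) {
            return cache.get(rowKey); // a hit skips bloom filters entirely
        }
    }

--
/ Peter Schuller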