On Sun, 2012-03-11 at 15:06 -0700, Peter Schuller wrote:
> If it is legitimate use of memory, you *may*, depending on your
> workload, want to adjust target bloom filter false positive rates:
>
> https://issues.apache.org/jira/browse/CASSANDRA-3497
This particular CF has up to ~10 billion rows over 3 nodes. Each row is
very small, <1k. Data from this CF is only read via Hadoop jobs, in
batch reads of 16k rows at a time. The *-Data.db files are typically
~50 GB, and the *-Filter.db files typically 2 GB, although some are 7 GB.
At the moment there are many pending compactions, but I can't run any
because the node crashes at startup.

My understanding, then, is that for this use case bloom filters are of
little importance, and that I can:
 - upgrade to 1.0.7
 - set fp_ratio=0.99
 - set index_interval=1024

This should alleviate much of the memory problem. Is this correct?
(A rough sketch of the changes I have in mind is at the end of this mail.)

~mck

-- 
"It seems that perfection is reached not when there is nothing left to
add, but when there is nothing left to take away" Antoine de Saint
Exupéry (William of Ockham)
| http://github.com/finn-no | http://tech.finn.no |
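
P.S. For reference, here is roughly what I mean by those three changes.
This is only a sketch based on my assumptions: I'm assuming the per-CF
attribute added by CASSANDRA-3497 is exposed as bloom_filter_fp_chance
in 1.0.7, that index_interval lives in cassandra.yaml, and MyHadoopCF is
just a placeholder for the real column family name; please correct me if
the names differ.

    # cassandra.yaml: sample fewer primary index entries into memory
    # (default is 128; 1024 trades some read CPU/IO for a smaller sample)
    index_interval: 1024

    # cassandra-cli: raise the target bloom filter false positive rate
    # for this CF so new filters are built much smaller
    update column family MyHadoopCF with bloom_filter_fp_chance = 0.99;

I'd also assume the new false positive setting only applies to newly
written SSTables, so the existing *-Filter.db files would only shrink
after compaction or a scrub/upgradesstables pass.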