On Sun, 2012-03-11 at 15:06 -0700, Peter Schuller wrote:
> If it is legitimate use of memory, you *may*, depending on your
> workload, want to adjust target bloom filter false positive rates:
> 
>    https://issues.apache.org/jira/browse/CASSANDRA-3497 

This particular cf has up to ~10 billion rows over 3 nodes. Each row is
very small, <1k. Data from this cf is only read via Hadoop jobs, in batch
reads of 16k rows at a time.

*-Data.db files are typically ~50G, and *-Filter.db files are typically 2G,
although some are 7G.
At the moment there are many pending compactions, but I can't run any
because the node crashes at startup.
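
(Rough sanity check on those Filter.db sizes: with the standard bloom filter
sizing of about -ln(p)/(ln 2)^2 bits per key, a target false positive rate p
of around 1% costs roughly 1.2 bytes per key, so an sstable holding on the
order of a billion of these tiny rows would carry a filter in the GB range.
The sizes we see therefore look like legitimate memory use rather than a
leak.)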

It's my understanding then that for this use case bloom filters are of
little importance, since the data is only ever read in bulk via Hadoop, and
that I can (sketch below)
 - upgrade to 1.0.7
 - set fp_ratio=0.99
 - set index_interval=1024
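
Concretely I was thinking of something along these lines. This is only a
sketch: the keyspace/cf names (MyKeyspace/MyCF) are placeholders, and I'm
assuming the per-cf option from CASSANDRA-3497 is exposed in cassandra-cli
as bloom_filter_fp_chance:

  In cassandra.yaml (takes effect on restart):

      index_interval: 1024

  In cassandra-cli, once the node is on 1.0.7:

      [default@MyKeyspace] update column family MyCF with bloom_filter_fp_chance=0.99;

As I understand it, filters are built when an sstable is written, so the
existing *-Filter.db files would only shrink as those sstables get
compacted after the change.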

This should alleviate most of the memory problem.
Is this correct?

~mck

-- 
"It seems that perfection is reached not when there is nothing left to
add, but when there is nothing left to take away" Antoine de Saint
Exupéry (William of Ockham) 

| http://github.com/finn-no | http://tech.finn.no |
