Re: reported bloom filter FP ratio

Radim Kolar Mon, 26 Dec 2011 10:57:34 -0800

My misunderstanding of the FP ratio was based on the assumption that the ratio is counted from node start, while the reported value is actually getRecentBloomFilterFalseRatio().
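To illustrate the difference between a ratio counted from node start and a "recent" one, here is a minimal sketch. It is not Cassandra's code: the class `FalsePositiveTracker`, its method names, and the choice of "all filter checks" as the denominator are hypothetical, and the reset-on-read behaviour of the recent value is an assumption made for the example.

```python
class FalsePositiveTracker:
    """Hypothetical counter pair: lifetime totals plus a 'recent' window."""

    def __init__(self):
        self.total_fp = 0        # false positives since start
        self.total_checks = 0    # filter checks since start
        self._fp_mark = 0        # totals as of the last recent_ratio() call
        self._checks_mark = 0

    def record(self, false_positive):
        """Record one bloom filter check and whether it was a false positive."""
        self.total_checks += 1
        if false_positive:
            self.total_fp += 1

    def cumulative_ratio(self):
        """Ratio over everything seen since start."""
        return self.total_fp / self.total_checks if self.total_checks else 0.0

    def recent_ratio(self):
        """Ratio over checks since the previous call (assumed reset-on-read)."""
        fp = self.total_fp - self._fp_mark
        checks = self.total_checks - self._checks_mark
        self._fp_mark = self.total_fp
        self._checks_mark = self.total_checks
        return fp / checks if checks else 0.0
```

With counters like these, a short burst of false positives shows up as a high recent ratio even when the ratio since start is still low, which is the kind of reading that is easy to misinterpret.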


> I don't understand how you reached that conclusion.

On my nodes most memory is consumed by bloom filters. Also, 1.0 creates larger bloom filters than 0.8, leading to higher memory consumption. I just checked a few sstables for the index to bloom filter ratio on the same dataset: in 0.8 bloom filters are about 13% of index size, and in 1.0 it's about 16%. The key used in the CF is a fixed-size 4-byte integer.

Cassandra does not measure memory used by index sampling yet. I suspect that it will be memory hungry too and can be safely lowered by default; I see very little difference when changing index sampling from 64 to 512.

The basic problem with Cassandra daily administration, which I am currently solving, is that memory consumption grows with your dataset size. I don't really like this design: you put more data in and the cluster can OOM. This makes Cassandra a less than optimal solution for data archiving. It will get better after tunable bloom filters are committed.
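As a rough illustration of why bloom filter memory grows with the dataset, and why a tunable false positive rate helps, here is a sketch based on the textbook sizing formula for an optimally configured bloom filter, bits per key = -ln(p) / (ln 2)^2. The function names are hypothetical, and Cassandra's actual sizing may differ from this idealised formula.

```python
import math


def bloom_bits_per_key(fp_rate):
    """Bits per key for an ideal bloom filter with target false positive rate fp_rate."""
    return -math.log(fp_rate) / (math.log(2) ** 2)


def bloom_filter_bytes(num_keys, fp_rate):
    """Approximate filter size in bytes for num_keys keys (textbook formula only)."""
    return math.ceil(num_keys * bloom_bits_per_key(fp_rate) / 8)


if __name__ == "__main__":
    # Size scales linearly with key count; a looser FP target shrinks it.
    for keys in (10**6, 10**8, 10**9):
        for p in (0.01, 0.1):
            size_mb = bloom_filter_bytes(keys, p) / 2**20
            print(f"{keys:>12} keys, p={p}: {size_mb:.1f} MiB")
```

Under this formula the size is linear in the number of keys, so doubling the data doubles the filter memory, while relaxing the target from 1% to 10% false positives roughly halves it.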
