My misunderstanding of the FP ratio was based on the assumption that the ratio is counted from node start, whereas it is actually the value from getRecentBloomFilterFalseRatio().

> I don't understand how you reached that conclusion.

On my nodes, most memory is consumed by bloom filters. Also, 1.0 creates larger bloom filters than 0.8, leading to higher memory consumption. I just checked a few sstables on the same dataset for the ratio of bloom filter size to index size: in 0.8 bloom filters are about 13% of the index size, and in 1.0 about 16%. The key used in the CF is a fixed-size 4-byte integer.
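To illustrate why bloom filter memory scales with the number of keys, here is a rough sketch using the textbook sizing formula m = -n ln(p) / (ln 2)^2. This is not Cassandra's actual sizing code, and the class name, method names, key counts and false-positive targets are all made up for illustration.

```java
// Hypothetical helper, NOT Cassandra's implementation: estimates bloom filter
// size from the textbook formula m = -n * ln(p) / (ln 2)^2.
public final class BloomFilterSizing {
    private BloomFilterSizing() {}

    /** Optimal bits per key for a target false-positive probability p. */
    public static double bitsPerKey(double p) {
        if (p <= 0.0 || p >= 1.0) {
            throw new IllegalArgumentException("p must be in (0, 1)");
        }
        double ln2 = Math.log(2.0);
        return -Math.log(p) / (ln2 * ln2);
    }

    /** Estimated filter size in bytes for the given key count and target p. */
    public static long filterBytes(long keys, double p) {
        if (keys < 0) {
            throw new IllegalArgumentException("keys must be >= 0");
        }
        return (long) Math.ceil(keys * bitsPerKey(p) / 8.0);
    }

    public static void main(String[] args) {
        // Illustrative values only; not measurements from a real cluster.
        long[] keyCounts = {1_000_000L, 100_000_000L, 1_000_000_000L};
        double[] targets = {0.001, 0.01, 0.1};
        for (long keys : keyCounts) {
            for (double p : targets) {
                System.out.printf("keys=%d p=%.3f -> %d bytes%n",
                        keys, p, filterBytes(keys, p));
            }
        }
    }
}
```

The estimate grows linearly with the key count, and a looser false-positive target is the only knob that shrinks it, which is why a tunable target matters for large datasets.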

Cassandra does not measure the memory used by index sampling yet. I suspect that it is memory hungry too and that the default sampling can safely be lowered; I see very little difference when changing index sampling from 64 to 512.
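As a back-of-the-envelope sketch of the sampling arithmetic, assuming one index entry is kept in memory per sampling interval: the class and method names below are hypothetical, and the per-entry overhead is a guessed parameter, not a measured Cassandra figure.

```java
// Hypothetical estimate, NOT Cassandra code: assumes one in-memory index
// sample is kept for every `interval` keys in an sstable.
public final class IndexSampleEstimate {
    private IndexSampleEstimate() {}

    /** Number of sampled index entries, rounded up. */
    public static long sampledEntries(long keys, int interval) {
        if (keys < 0 || interval <= 0) {
            throw new IllegalArgumentException("keys >= 0 and interval > 0 required");
        }
        return (keys + interval - 1) / interval;
    }

    /**
     * Rough memory estimate in bytes. overheadBytes stands in for per-entry
     * object and position overhead and is an assumption to be tuned.
     */
    public static long estimatedBytes(long keys, int interval, int keyBytes, int overheadBytes) {
        return sampledEntries(keys, interval) * (long) (keyBytes + overheadBytes);
    }

    public static void main(String[] args) {
        long keys = 1_000_000_000L; // illustrative key count
        int keyBytes = 4;           // fixed-size 4-byte integer key, as in this CF
        int overhead = 48;          // guessed per-entry overhead in bytes
        for (int interval : new int[] {64, 128, 512}) {
            System.out.printf("interval=%d -> %d entries, ~%d bytes%n",
                    interval, sampledEntries(keys, interval),
                    estimatedBytes(keys, interval, keyBytes, overhead));
        }
    }
}
```

Under this assumption, going from 64 to 512 keeps about one eighth of the entries in memory, whatever the real per-entry cost turns out to be.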

The basic problem with daily Cassandra administration, which I am currently working on, is that memory consumption grows with dataset size. I don't really like this design: you put more data in and the cluster can OOM. This makes Cassandra a less than optimal solution for data archiving. It will get better once tunable bloom filters are committed.
