Re: configurable bloom filters (like hbase)

Radim Kolar Wed, 14 Dec 2011 02:53:17 -0800

On 11.11.2011 7:55, Radim Kolar wrote:

I have a problem with a large CF (about 200 billion entries per node). While I can configure index_interval to lower memory requirements, I still have to stick with huge bloom filters.
Ideally, bloom filters would be configurable like in HBase. The Cassandra default is about a 1.05% false positive rate, but in my case I would be fine even with a 20% false positive rate. The data is not often read back. Most of it will never be read before it expires via TTL.
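
For a rough sense of what that trade buys, here is a minimal sketch using the textbook sizing formula for a Bloom filter with an optimally chosen number of hash functions, bits per entry = -ln(p) / (ln 2)^2. The function names are made up for illustration, and Cassandra's actual sizing code may round or bucket differently, so treat the numbers as estimates only.

```python
import math

def bits_per_entry(fp_rate: float) -> float:
    """Bits needed per stored key for a given false positive rate,
    assuming the number of hash functions is chosen optimally."""
    return -math.log(fp_rate) / (math.log(2) ** 2)

def relative_size(new_fp: float, old_fp: float) -> float:
    """Size of a filter tuned for new_fp as a fraction of one tuned for old_fp."""
    return bits_per_entry(new_fp) / bits_per_entry(old_fp)

if __name__ == "__main__":
    for p in (0.0105, 0.20):
        print(f"fp={p:.4f}  bits/entry={bits_per_entry(p):.2f}")
    print(f"relative size at 20% vs 1.05%: {relative_size(0.20, 0.0105):.2f}")
```

Under these assumptions a 1.05% filter needs roughly 9.5 bits per entry and a 20% filter roughly 3.3, so relaxing the false positive rate this far would cut the filter to a bit over a third of its current size.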

Does anybody else have the problem that bloom filters use too much memory in applications that do not need to read written data often?

I am looking at the memory used by bloom filters, and it would be ideal for cassandra-1.1 to have the ability to shrink bloom filters to about 1/10 of their size. Is it possible to code something like this: save bloom filters to disk as usual, but during load transform them into something smaller at the cost of an increased FP rate?
