On 11.11.2011 7:55, Radim Kolar wrote:
I have a problem with a large CF (about 200 billion entries per node). While I can configure index_interval to lower the memory requirements, I am still stuck with huge bloom filters.

Ideally, bloom filters would be configurable like in HBase. The Cassandra default is a false positive rate of about 1.05%, but in my case I would be fine even with a 20% false positive rate. The data is not read back often; most of it will never be read before it expires via TTL.
Does anybody else have the problem that bloom filters use too much memory in applications that do not need to read written data often?
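
To put rough numbers on the trade-off between false positive rate and memory, here is a minimal sketch using the textbook sizing formula for an optimally configured Bloom filter, bits per key = -ln(p) / (ln 2)^2. The function names are made up for illustration, and Cassandra's actual sizing code may round or bucket differently, so treat the results as estimates only.

```python
import math


def bits_per_key(fp_rate):
    """Bits needed per stored key by an optimally sized Bloom filter."""
    return -math.log(fp_rate) / (math.log(2) ** 2)


def optimal_hash_count(fp_rate):
    """Number of hash functions that minimizes size for the given FP rate."""
    return max(1, round(-math.log2(fp_rate)))


# Compare the ~1.05% rate mentioned above with the 20% rate that would be acceptable.
for p in (0.0105, 0.20):
    print(f"fp={p:.4f}  bits/key={bits_per_key(p):.2f}  hashes={optimal_hash_count(p)}")

# Fraction of the original memory still needed at the relaxed rate.
ratio = bits_per_key(0.20) / bits_per_key(0.0105)
print(f"size at 20% FP relative to 1.05% FP: {ratio:.2f}")
```

Under this formula a 20% false positive rate needs roughly a third of the memory of a 1.05% filter, so it is a large saving but not a full tenfold reduction.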

I am looking at the memory used by bloom filters, and it would be ideal for cassandra-1.1 to have the ability to shrink bloom filters to about 1/10 of their size. Is it possible to code something like this: save bloom filters to disk as usual, but during load transform them into something smaller at the cost of an increased FP rate?
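
One way such a load-time transformation could work is "folding": if bit positions are computed as hash mod m, and the smaller size divides m, the large filter can be OR-ed onto itself to get a smaller filter that still has no false negatives. The sketch below is a toy illustration, not Cassandra's implementation; the `BloomFilter` class, the SHA-256 based double hashing, and the `fold` helper are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter whose bit positions are (h1 + i*h2) mod num_bits."""

    def __init__(self, num_bits, num_hashes, bits=None):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bits if bits is not None else [False] * num_bits

    def _positions(self, key):
        # Derive two 64-bit hashes from one digest and combine them (double hashing).
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def fold(bf, factor):
    """Shrink a filter by an integer factor by OR-ing its segments together.

    Works because (x mod m) mod (m // factor) == x mod (m // factor)
    whenever factor divides m, so every set bit lands where the smaller
    filter would have set it.
    """
    if bf.num_bits % factor != 0:
        raise ValueError("factor must divide the filter size")
    small_size = bf.num_bits // factor
    small_bits = [False] * small_size
    for i, bit in enumerate(bf.bits):
        if bit:
            small_bits[i % small_size] = True
    return BloomFilter(small_size, bf.num_hashes, small_bits)


# Build a full-size filter, then load a version at 1/10 of the size.
keys = [f"key{i}".encode() for i in range(100)]
full = BloomFilter(num_bits=10_000, num_hashes=3)
for k in keys:
    full.add(k)

small = fold(full, 10)
print(small.num_bits, all(small.might_contain(k) for k in keys))
```

The folded filter keeps the original number of hash functions, which is generally not optimal for the smaller size, so its false positive rate would be somewhat worse than that of a filter built at the small size from scratch.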
