https://issues.apache.org/jira/browse/CASSANDRA-3497
On Wed, Dec 14, 2011 at 4:52 AM, Radim Kolar <h...@sendmail.cz> wrote:
> On 11.11.2011 at 7:55, Radim Kolar wrote:
>
>> I have a problem with a large CF (about 200 billion entries per node). While
>> I can configure index_interval to lower memory requirements, I still have to
>> stick with huge bloom filters.
>>
>> Ideally, bloom filters would be configurable as they are in HBase. The
>> Cassandra default is about a 1.05% false positive rate, but in my case I
>> would be fine even with a 20% false positive rate. The data are not often
>> read back; most entries will never be read before they expire via TTL.
>
> Has anybody else had the problem of bloom filters using too much memory in
> applications that do not need to read written data often?
>
> Looking at the memory used by bloom filters, it would be ideal if
> cassandra-1.1 could shrink bloom filters to about 1/10 of their size.
> Would it be possible to code something like this: save bloom filters to disk
> as usual, but during load, transform them into something smaller at the cost
> of an increased FP rate?
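To put numbers on the trade-off being discussed, the standard bloom filter sizing formula gives bits per entry as -ln(p) / (ln 2)^2 for a target false-positive rate p. The sketch below (plain Python, not Cassandra's actual implementation) computes the filter size for the 200 billion entries mentioned above, and also illustrates one way a filter could be shrunk at load time: "folding" the bit array by OR-ing equal-length segments together, which halves (or quarters, etc.) the size while raising the FP rate. The `fold` function assumes bit positions were originally computed as hash % len(bits); whether Cassandra's filters index that way is not something this sketch verifies.

```python
import math

def bloom_bits_per_entry(p):
    """Bits per entry needed for false-positive rate p: -ln(p) / (ln 2)^2."""
    return -math.log(p) / (math.log(2) ** 2)

def filter_size_gb(entries, p):
    """Total bloom filter size in GB for the given entry count and FP rate."""
    return entries * bloom_bits_per_entry(p) / 8 / 1e9

# 200 billion entries per node, as in the mail above:
n = 200e9
print(filter_size_gb(n, 0.01))   # roughly 240 GB at a ~1% FP rate
print(filter_size_gb(n, 0.20))   # roughly 84 GB at a 20% FP rate

def fold(bits, factor):
    """Shrink a bit array to 1/factor of its length by OR-ing segments.

    Membership tests must afterwards use hash % new_length; this only
    preserves correctness (no false negatives) if the original filter
    indexed with hash % old_length.
    """
    assert len(bits) % factor == 0
    new_len = len(bits) // factor
    out = [False] * new_len
    for i, bit in enumerate(bits):
        if bit:
            out[i % new_len] = True
    return out
```

Note that folding only buys a factor-of-2^k reduction per application and degrades the FP rate faster than building a smaller filter from scratch would; the formula above shows a 1% → 20% FP trade yields roughly a 3x size reduction, so the 1/10 target in the mail would imply a substantially higher FP rate than 20%.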