Re: penn state academic paper - "scalable" bloom filters

2018-02-22 Thread Jeff Jirsa
Potentially more interesting are range filters: https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-9843 RocksDB also has a prefix bloom filter (https://github.com/facebook/rocksdb/wiki/Prefix-Seek-API-Changes), which we could potentially use to track partition:partial-clustering

Re: penn state academic paper - "scalable" bloom filters

2018-02-22 Thread Jay Zhuang
> 62.7953&rep=rep1&type=pdf > > looks to be an adaptive approach where the "initial guess" bloom filters > are enhanced with more layers of ones generated after usage stats are > gained. > > Disclaimer: I suck at reading academic papers. >

penn state academic paper - "scalable" bloom filters

2018-02-22 Thread Carl Mueller
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.62.7953&rep=rep1&type=pdf looks to be an adaptive approach where the "initial guess" bloom filters are enhanced with more layers of ones generated after usage stats are gained. Disclaimer: I suck at reading academic papers.
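
A rough sketch of the layering idea described above, in Python. This is one common reading of a "scalable" Bloom filter, not a confirmed transcription of the linked paper: when the current layer reaches its planned capacity, a larger layer with a tighter error rate is added, and lookups check every layer. The names `_Layer`, `ScalableBloomFilter`, `growth`, and `tightening` are hypothetical, chosen for illustration only.

```python
import hashlib
import math


class _Layer:
    """One fixed-size Bloom filter sized for `capacity` keys at `error_rate`."""

    def __init__(self, capacity, error_rate):
        self.capacity = capacity
        self.error_rate = error_rate
        self.count = 0
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 and k = (m / n) * ln 2.
        self.num_bits = max(8, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)
        self.count += 1

    def __contains__(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


class ScalableBloomFilter:
    """Grows by appending layers instead of resizing an existing one."""

    def __init__(self, initial_capacity=100, error_rate=0.01, growth=2, tightening=0.5):
        self.growth = growth
        self.tightening = tightening
        # Layer i gets error_rate * (1 - r) * r^i, so the sum stays below error_rate.
        self.layers = [_Layer(initial_capacity, error_rate * (1 - tightening))]

    def add(self, key):
        if key in self:
            return  # already reported as present (possibly a false positive)
        last = self.layers[-1]
        if last.count >= last.capacity:
            last = _Layer(last.capacity * self.growth, last.error_rate * self.tightening)
            self.layers.append(last)
        last.add(key)

    def __contains__(self, key):
        return any(key in layer for layer in self.layers)
```

Older layers are never rewritten, so no original keys are needed when the filter grows; the cost is that a lookup may have to probe several layers.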

Re: implementation choice with regard to multiple range slice query filters

2012-04-03 Thread David Alves
cool, thanks. -david On Apr 4, 2012, at 1:01 AM, Jonathan Ellis wrote: > You need more than column_index_size_in_kb worth of column data for it > to generate row header index entries. We have a cassandra.yaml in > test/conf that sets that extra low, to 4, to make that easier. "ant > test" sets

Re: implementation choice with regard to multiple range slice query filters

2012-04-03 Thread Jonathan Ellis
You need more than column_index_size_in_kb worth of column data for it to generate row header index entries. We have a cassandra.yaml in test/conf that sets that extra low, to 4, to make that easier. "ant test" sets up the environment to point to that yaml, but if you're running it from your IDE

Re: implementation choice with regard to multiple range slice query filters

2012-04-03 Thread David Alves
Hi Jonathan: Thanks for the tip. Although the first option I proposed would not incur that penalty, it would not take advantage of the column index for the middle ranges. On a related matter, I'm struggling to test the IndexedBlockFetcher implementation (SimpleBlockF

Re: implementation choice with regard to multiple range slice query filters

2012-04-02 Thread Jonathan Ellis
That would work, but I think the best approach would actually be to push multiple ranges down into ISR itself; otherwise you could waste a lot of time reading the row header redundantly (the skipBloomFilter/deserializeIndex part). The tricky part would be getting IndexedBlockFetcher to not do extra work

implementation choice with regard to multiple range slice query filters

2012-04-02 Thread David Alves
Hi guys, I'm a PhD student and I'm trying to dip my feet in the water wrt Cassandra development, as I'm a long-time fan. I'm implementing CASSANDRA-3885, which pertains to supporting returning multiple slices of a row. After looking around at the portion of the

Re: Filters

2010-04-20 Thread Christian Torres
Thanks a lot!! On Tue, Apr 20, 2010 at 10:58 AM, Roger Schildmeijer wrote: > http://wiki.apache.org/cassandra/API#get_slice > > // Roger Schildmeijer > On 20 apr 2010, at 18.50em, Christian Torres wrote: > > > Hello! > > > > Is there any way to make filters (W

Re: Filters

2010-04-20 Thread Roger Schildmeijer
http://wiki.apache.org/cassandra/API#get_slice // Roger Schildmeijer On 20 apr 2010, at 18.50em, Christian Torres wrote: > Hello! > > Is there any way to make filters (WHEREs) in cassandra? Or I have to manages > to do it > > For example: > > I have a ColumnFamily w

Filters

2010-04-20 Thread Christian Torres
Hello! Is there any way to make filters (WHEREs) in Cassandra, or do I have to manage it myself? For example: I have a ColumnFamily with a column in each row whose value is a state, Public or Private, so I want to filter all rows that are private, and also the public ones in another form... Beside

Re: Bloom Filters

2010-04-08 Thread Jeff Schmitz
I think D Boon from Minutemen RIP Sent from my iPhone On Apr 8, 2010, at 11:06 AM, Tatu Saloranta wrote: On Thu, Apr 8, 2010 at 5:05 AM, Ran Tavory wrote: +1 For boon. I kinda liked it... :-) Surely we can find a minor tweak to bloom, and call "enhanced version Boon filter. Plus there

Re: Bloom Filters

2010-04-08 Thread Tatu Saloranta
On Thu, Apr 8, 2010 at 5:05 AM, Ran Tavory wrote: > +1 For boon. > I kinda liked it... :-) Surely we can find a minor tweak to bloom, and call the "enhanced" version a Boon filter. Plus there's plenty of related names from thereon! (boom, booze etc). -+ Tatu +-

Re: Bloom Filters

2010-04-08 Thread S Ahmed
:04 AM, gabriele renzi wrote: > > 2010/4/7 Peter Schüller : > > > >> (bloomfilters, not boonfilters) > >> > >> Speaking in general, not specific to cassandra: > >> > >> 2. Are boonfilters a fixed size, or they adjust as to the # of keys? >

Re: Bloom Filters

2010-04-08 Thread Ran Tavory
>> Speaking in general, not specific to cassandra: >> >> 2. Are boonfilters a fixed size, or they adjust as to the # of keys? any >>> example size? >>> >> >> Bloom filters are by their very nature lossy in the sense that you >> cannot determine

Bloom Filters

2010-04-08 Thread Jeff Schmitz
keys? any example size? Bloom filters are by their very nature lossy in the sense that you cannot determine later what you put into them. Re-sizing a bloom filter implies re-creating it from scratch. I'm not sure what Cassandra does, however. I believe traditional bloom filters require y
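
To illustrate the point about lossiness and resizing, here is a minimal Python sketch of a fixed-size Bloom filter. It is not Cassandra's implementation; `BloomFilter` and `rebuild` are hypothetical names, and the SHA-256 double-hashing scheme is just one convenient choice. The filter stores only bits, so the keys cannot be read back out, and a "resize" has to re-add every key from some other source.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter with num_bits bits and num_hashes hash functions."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Bit positions depend on num_bits, so they change if the size changes.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


def rebuild(keys, num_bits, num_hashes):
    """'Resize' by building a new filter from the original keys."""
    resized = BloomFilter(num_bits, num_hashes)
    for key in keys:
        resized.add(key)
    return resized
```

For example, `rebuild(keys, 1024, 3)` produces a larger filter only because the caller still has `keys`; nothing in an existing `BloomFilter` instance could supply them.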