Mostly, but not 100%. There is a bloom filter for each sstable, so
"going to disk" means looking for the row in every sstable; any sstable
you can skip is a win. Sometimes the node has the row, just not in
sstable N. The bloom filter lets you avoid reading sstable N only to
find nothing.
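
To make that concrete, here is a minimal sketch of the idea, not Cassandra's
actual read path. The BloomFilter, SSTable and read_row names are made up for
illustration, and the filter is a toy one built on hashlib. Each sstable keeps
its own filter, and a read only "goes to disk" on the sstables whose filter
says the key might be there.

```python
import hashlib


class BloomFilter:
    """Toy bloom filter (hypothetical, for illustration only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


class SSTable:
    """Stand-in for an on-disk sstable: a dict of rows plus its filter."""

    def __init__(self, rows, use_filter=True):
        self.rows = dict(rows)
        self.use_filter = use_filter
        self.filter = BloomFilter()
        for key in self.rows:
            self.filter.add(key)

    def might_contain(self, key):
        # With the filter disabled, every sstable has to be checked.
        return self.filter.might_contain(key) if self.use_filter else True


def read_row(sstables, key):
    """Return (merged_columns, disk_lookups) for one row key."""
    merged, disk_lookups = {}, 0
    for table in sstables:
        if not table.might_contain(key):
            continue  # skip this sstable without touching the disk
        disk_lookups += 1
        merged.update(table.rows.get(key, {}))
    return merged, disk_lookups
```

With 10 sstables and a row that lives in only one of them, read_row touches
that sstable plus any false positives. With use_filter=False it has to look
in all 10, even though the row exists.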

On Mon, Feb 25, 2013 at 8:27 AM, Hiller, Dean <dean.hil...@nrel.gov> wrote:
> Hmmmm, I thought bloom filters only help on missing rows.  Any time we look up
> a row, we know it is there in our case, as it would not be in the other table.
> I would say statistically 99.9% of the time the row is there, and we are okay
> with wasting a disk hit the other 0.1% of the time.
>
> Do I have this correct though?  Bloomfilters really only help me if the data 
> is not there so I don't have to go to the disk and find that out.
>
> Thanks,
> Dean
>
> From: aaron morton <aa...@thelastpickle.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Sunday, February 24, 2013 7:09 PM
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: disabling bloomfilter not working? or did I do this wrong?
>
> Yeah, disabling completely is probably not great.
> There is some wriggle room between disabled and "less memory".
>
> Did I link to this bloom filter calculator? http://hur.st/bloomfilter Also see
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/BloomCalculations.java
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 23/02/2013, at 12:10 PM, Bryan Talbot <btal...@aeriagames.com> wrote:
>
> I see from your read and write counts that your nreldata CF has nearly as
> many reads as writes.  I would expect that disabling your bloom filter
> is going to hurt your read performance quite a bit.
>
> Also, beware that disabling your bloom filter may cause tombstoned rows
> to never be deleted, so if you delete all columns explicitly or use TTL, your
> data may grow more than you expect.
> https://issues.apache.org/jira/browse/CASSANDRA-5182
>
> -Bryan
>
>
>
>
> On Fri, Feb 22, 2013 at 11:59 AM, Hiller, Dean <dean.hil...@nrel.gov> wrote:
> Thanks, but I found out it is still running.  It looks like I have about a 5
> hour wait left for my upgradesstables (waited 4 hours already).  I will check
> the bloom filter after that.
>
> Out of curiosity, if I had much wider rows (i.e. < 900k per row), would
> compaction (errrr…upgradesstables) run any faster, or would it basically run
> at the same speed?
>
> I guess what I am wondering is: is 9 hours a normal compaction time for
> 130 GB of data?
>
> Thanks,
> Dean
>
> From: aaron morton <aa...@thelastpickle.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Friday, February 22, 2013 10:29 AM
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: disabling bloomfilter not working? or did I do this wrong?
>
> Bloom Filter Space Used: 2318392048
> Just as a sanity check, take a quick look at the -Filter.db files on disk for
> this CF. If they are very small, try restarting the node.
>
> Number of Keys (estimate): 1249133696
> Hey, a billion rows on a node, what an age we live in :)
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 23/02/2013, at 4:35 AM, "Hiller, Dean" <dean.hil...@nrel.gov> wrote:
>
> So in the cli, I ran
>
> update column family nreldata with bloom_filter_fp_chance=1.0;
>
> Then I ran
>
> nodetool upgradesstables databus5 nreldata;
>
> But my bloom filter size is still around 2 GB (and I want to free up this
> heap)!!!! According to the nodetool cfstats command…
>
> Column Family: nreldata
> SSTable count: 10
> Space used (live): 96841497731
> Space used (total): 96841497731
> Number of Keys (estimate): 1249133696
> Memtable Columns Count: 7066
> Memtable Data Size: 4286174
> Memtable Switch Count: 924
> Read Count: 19087150
> Read Latency: 0.595 ms.
> Write Count: 21281994
> Write Latency: 0.013 ms.
> Pending Tasks: 0
> Bloom Filter False Postives: 974393
> Bloom Filter False Ratio: 0.99998
> Bloom Filter Space Used: 2318392048
> Compacted row minimum size: 73
> Compacted row maximum size: 446
> Compacted row mean size: 143
>
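
As a rough sanity check on the cfstats numbers quoted above, the sketch below
uses the textbook bloom filter formula p = exp(-(m/n) * ln(2)^2), which
assumes an optimal number of hash functions. That is an assumption: Cassandra
sizes its filters from its own table in BloomCalculations.java, so real sizes
will differ somewhat. The function names are made up for illustration.

```python
import math

LN2_SQUARED = math.log(2) ** 2


def bits_per_key(filter_bytes, num_keys):
    """Bits of bloom filter actually spent on each key."""
    return filter_bytes * 8 / num_keys


def fp_chance_for(bits):
    """Textbook false-positive rate for a given bits-per-key budget."""
    return math.exp(-bits * LN2_SQUARED)


def bits_needed_for(fp_chance):
    """Textbook bits-per-key needed to hit a target false-positive rate."""
    return -math.log(fp_chance) / LN2_SQUARED


def filter_bytes_for(fp_chance, num_keys):
    """Approximate total filter size in bytes for num_keys keys."""
    return math.ceil(bits_needed_for(fp_chance) * num_keys / 8)


# Figures taken from the cfstats output quoted in this thread.
FILTER_BYTES = 2318392048
NUM_KEYS = 1249133696

observed = bits_per_key(FILTER_BYTES, NUM_KEYS)
print(f"bits per key now: {observed:.2f}")
print(f"implied fp chance: {fp_chance_for(observed):.6f}")
for target in (0.01, 0.1, 1.0):
    size = filter_bytes_for(target, NUM_KEYS)
    print(f"fp chance {target}: about {size} bytes")
```

The current filter works out to roughly 15 bits per key. Relaxing the
false-positive chance shrinks the filter well before it reaches 1.0, which is
the room between "disabled" and "less memory" mentioned earlier in the thread.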