Re: bloom filter fp ratio of 0.98 with fp_chance of 0.01

Hiller, Dean Thu, 28 Mar 2013 11:03:02 -0700

LCS does much more "constant" compaction than STCS, which keeps the load on the 
disks (reads and writes to move the data) higher.  STCS does not do as many 
constant operations.

Dean

From: Alain RODRIGUEZ <arodr...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Thursday, March 28, 2013 3:18 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: bloom filter fp ratio of 0.98 with fp_chance of 0.01

"remember it uses more IO than STCS"

Do you mean during compactions? I thought that LCS should decrease the number 
of disk reads while not compacting (since 90% of the data isn't spread across 
multiple sstables and C* needs to read only one file to find the entire row), 
right?

2013/3/28 aaron morton <aa...@thelastpickle.com>
You nailed it. A significant number of reads are done from hundreds of sstables 
( I have to add, compaction is apparently constantly 6000-7000 tasks behind and 
the vast majority of the reads access recently written data )
So that's not good.
If IO is saturated then maybe LCS is not for you; remember that it uses more IO 
than STCS.
Otherwise look at the compaction yaml settings to see if you can make it go 
faster but watch out that you don't hurt normal requests.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 28/03/2013, at 7:00 AM, Wei Zhu <wz1...@yahoo.com> wrote:

Welcome to the wonderland of SSTableSize of LCS. There is some discussion 
around it, but no guidelines yet.

I asked people on IRC, and someone is running as high as 128 MB in production 
with no problem. I guess you have to test it on your system and see how it 
performs.

Attached is the related thread for your reference.

-Wei

----- Original Message -----
From: "Andras Szerdahelyi" <andras.szerdahe...@ignitionone.com>
To: user@cassandra.apache.org
Sent: Wednesday, March 27, 2013 1:19:06 AM
Subject: Re: bloom filter fp ratio of 0.98 with fp_chance of 0.01

Aaron,

What version are you using ?

1.1.9

Have you changed the bf_ chance ? The sstables need to be rebuilt for it to 
take effect.

I did ( several times ) and I ran upgradesstables after

Not sure what this means.
Are you saying it's in a boat on a river, with tangerine trees and marmalade 
skies ?

You nailed it. A significant number of reads are done from hundreds of sstables 
( I have to add, compaction is apparently constantly 6000-7000 tasks behind and 
the vast majority of the reads access recently written data )
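
A rough sketch of how reading from many sstables could push the reported ratio far above the configured chance. It assumes (this is not confirmed anywhere in this thread) that "Bloom Filter False Ratio" is computed as false positives divided by all positive filter answers (false plus true), not as false positives divided by all lookups for absent keys. The function name and the example counts are hypothetical.

```python
def expected_false_ratio(sstables_without_key, sstables_with_key, fp_chance):
    """Expected FP / (FP + TP) for one read, under the assumption stated above."""
    # Filters of sstables that do not hold the key wrongly answer "maybe"
    # with probability fp_chance each.
    expected_fp = sstables_without_key * fp_chance
    # Filters of sstables that do hold the key always answer "maybe".
    expected_tp = sstables_with_key
    positives = expected_fp + expected_tp
    return expected_fp / positives if positives else 0.0


# Hypothetical reads that consult 10, 300 and 4900 sstables, one of which holds the row.
for checked in (10, 300, 4900):
    print(checked, round(expected_false_ratio(checked - 1, 1, 0.01), 3))
```

Under that assumption each filter can still be honouring its 0.01 target while the aggregate ratio climbs with the number of sstables consulted per read, and it climbs further when many reads are for keys that are in no sstable at all.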

Take a look at the nodetool cfhistograms to get a better idea of the row size 
and use that info when considering the sstable size.

It's around 1-20K. What should I optimise the LCS sstable size for? I suppose 
"I want to fit as many complete rows as possible into a single sstable to keep 
the file count down while avoiding compactions of oversized ( double digit 
gigabytes? ) sstables at higher levels" ?
Do I have to run a major compaction after a change to sstable_size_in_mb ? The 
larger sstable size wouldn't really affect sstables on levels above L0 , would 
it?

Thanks!!
Andras

From: aaron morton <aa...@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday 26 March 2013 21:46
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: bloom filter fp ratio of 0.98 with fp_chance of 0.01

What version are you using ?
1.2.0 allowed a null bf chance, and I think it returned .1 for LCS and .01 for 
STS compaction.
Have you changed the bf_ chance ? The sstables need to be rebuilt for it to 
take effect.

and sstables read is in the skies
Not sure what this means.
Are you saying it's in a boat on a river, with tangerine trees and marmalade 
skies ?

SSTable count: 22682

Lots of files there, I imagine this would dilute the effectiveness of the key 
cache. It's caching (sstable, key) tuples.
You may want to look at increasing the sstable_size with LCS.
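
To get a feel for how the file count scales with sstable_size_in_mb, here is a back-of-the-envelope sketch using the "Space used (live)" figure from the cfstats output in this thread. It assumes that 1 MB means 1024 * 1024 bytes and that every sstable ends up filled to the target size on disk; neither assumption is confirmed here, and the helper name is hypothetical.

```python
import math

LIVE_BYTES = 119771434915  # "Space used (live)" reported by cfstats in this thread


def estimated_sstable_count(live_bytes, sstable_size_in_mb):
    """Rough file count if every sstable were filled exactly to the target size."""
    return math.ceil(live_bytes / (sstable_size_in_mb * 1024 * 1024))


# 5 is the current setting; 32 and 128 are candidate values to compare against.
for size_mb in (5, 32, 128):
    print(size_mb, estimated_sstable_count(LIVE_BYTES, size_mb))
```

At 5 MB this lands close to the 22682 files actually reported, and at 128 MB it drops below a thousand, which is the kind of reduction that would make (sstable, key) cache entries go further.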

Compacted row minimum size: 104
Compacted row maximum size: 263210
Compacted row mean size: 3041

Take a look at the nodetool cfhistograms to get a better idea of the row size 
and use that info when considering the sstable size.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 26/03/2013, at 6:16 AM, Andras Szerdahelyi <andras.szerdahe...@ignitionone.com> wrote:

Hello list,

Could anyone shed some light on how an FP chance of 0.01 can coexist with a 
measured FP ratio of .. 0.98 ? Am I reading this wrong, or do 98% of the 
requests hitting the bloom filter produce a false positive while the "target" 
false ratio is 0.01?
( Also key cache hit ratio is around 0.001 and sstables read is in the skies ( 
non-exponential (non-) drop off for LCS ) but that should be filed under 
"effect" and not "cause"? )

[default@unknown] use KS;
Authenticated to keyspace: KS
[default@KS] describe CF;
ColumnFamily: CF
Key Validation Class: org.apache.cassandra.db.marshal.BytesType
Default column value validator: org.apache.cassandra.db.marshal.BytesType
Columns sorted by: org.apache.cassandra.db.marshal.BytesType
GC grace seconds: 691200
Compaction min/max thresholds: 4/32
Read repair chance: 0.1
DC Local Read repair chance: 0.0
Replicate on write: true
Caching: ALL
Bloom Filter FP chance: 0.01
Built indexes: []
Compaction Strategy: org.apache.cassandra.db.compaction.LeveledCompactionStrategy
Compaction Strategy Options:
sstable_size_in_mb: 5
Compression Options:
sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor

Keyspace: KS
Read Count: 628950
Read Latency: 93.19921121869784 ms.
Write Count: 1219021
Write Latency: 0.14352380885973254 ms.
Pending Tasks: 0
Column Family: CF
SSTable count: 22682
Space used (live): 119771434915
Space used (total): 119771434915
Number of Keys (estimate): 203837952
Memtable Columns Count: 13125
Memtable Data Size: 33212827
Memtable Switch Count: 15
Read Count: 629009
Read Latency: 88.434 ms.
Write Count: 1219038
Write Latency: 0.095 ms.
Pending Tasks: 0
Bloom Filter False Positives: 37939419
Bloom Filter False Ratio: 0.97928
Bloom Filter Space Used: 261572784
Compacted row minimum size: 104
Compacted row maximum size: 263210
Compacted row mean size: 3041
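
A quick sanity check on whether the filters themselves are sized for the 0.01 target, using only the figures above. This is a sketch based on the textbook Bloom filter formula with an optimal number of hash functions; Cassandra's actual sizing may round differently, and "Number of Keys (estimate)" is only an estimate, so treat the result as approximate. The function names are hypothetical.

```python
import math

BLOOM_FILTER_BYTES = 261572784  # "Bloom Filter Space Used" from the cfstats output
ESTIMATED_KEYS = 203837952      # "Number of Keys (estimate)" from the cfstats output


def bits_per_key(filter_bytes, keys):
    """Average number of Bloom filter bits available per key."""
    return filter_bytes * 8 / keys


def ideal_fp_rate(bits):
    """Textbook FP rate with the optimal hash count: exp(-bits * ln(2)^2)."""
    return math.exp(-bits * math.log(2) ** 2)


bits = bits_per_key(BLOOM_FILTER_BYTES, ESTIMATED_KEYS)
print(round(bits, 2), round(ideal_fp_rate(bits), 4))
```

This works out to a little over 10 bits per key and an ideal false positive rate somewhat below 0.01, which suggests the filters were built for the configured chance and that the 0.98 figure comes from how reads hit them, not from undersized filters.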

I upgraded sstables after changing the FP chance

Thanks!
Andras
<attachment.eml>
