LCS does much more "constant" compaction than STCS, which keeps the load on the disks (reads and writes to move the data) higher. STCS does not do as many constant operations.
Dean

From: Alain RODRIGUEZ <arodr...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Thursday, March 28, 2013 3:18 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: bloom filter fp ratio of 0.98 with fp_chance of 0.01

"remember it uses more IO than STS"

Do you mean during compactions ? Because I thought that LCS should decrease the number of disk reads (since 90% of the data isn't spread across multiple sstables and C* needs to read only one file to find the entire row) while not compacting, right ?

2013/3/28 aaron morton <aa...@thelastpickle.com>

> You nailed it. A significant number of reads are done from hundreds of sstables ( I have to add, compaction is apparently constantly 6000-7000 tasks behind and the vast majority of the reads access recently written data )

So that's not good.

If IO is saturated then maybe LCS is not for you, remember it uses more IO than STS. Otherwise look at the compaction yaml settings to see if you can make it go faster but watch out that you don't hurt normal requests.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 28/03/2013, at 7:00 AM, Wei Zhu <wz1...@yahoo.com> wrote:

Welcome to the wonderland of SSTableSize of LCS. There is some discussion around it, but no guidelines yet. I asked people on IRC; someone is running as high as 128M in production with no problem. I guess you have to test it on your system and see how it performs.

Attached is the related thread for your reference.
-Wei

----- Original Message -----
From: "Andras Szerdahelyi" <andras.szerdahe...@ignitionone.com>
To: user@cassandra.apache.org
Sent: Wednesday, March 27, 2013 1:19:06 AM
Subject: Re: bloom filter fp ratio of 0.98 with fp_chance of 0.01

Aaron,

> What version are you using ?

1.1.9

> Have you changed the bf_ chance ? The sstables need to be rebuilt for it to take effect.

I did ( several times ) and I ran upgradesstables after

> Not sure what this means. Are you saying it's in a boat on a river, with tangerine trees and marmalade skies ?

You nailed it. A significant number of reads are done from hundreds of sstables ( I have to add, compaction is apparently constantly 6000-7000 tasks behind and the vast majority of the reads access recently written data )

> Take a look at the nodetool cfhistograms to get a better idea of the row size and use that info when considering the sstable size.

It's around 1-20K, what should I optimise the LCS sstable size for? I suppose "I want to fit as many complete rows as possible into a single sstable to keep file count down while avoiding compactions of oversized ( double digit gigabytes? ) sstables at higher levels ? "

Do I have to run a major compaction after a change to sstable_size_in_mb ? The larger sstable size wouldn't really affect sstables on levels above L0 , would it?

Thanks!!
Andras

From: aaron morton <aa...@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday 26 March 2013 21:46
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: bloom filter fp ratio of 0.98 with fp_chance of 0.01

What version are you using ?
1.2.0 allowed a null bf chance, and I think it returned .1 for LCS and .01 for STS compaction.
Have you changed the bf_ chance ? The sstables need to be rebuilt for it to take effect.

> and sstables read is in the skies

Not sure what this means. Are you saying it's in a boat on a river, with tangerine trees and marmalade skies ?

> SSTable count: 22682

Lots of files there, I imagine this would dilute the effectiveness of the key cache. It's caching (sstable, key) tuples.
You may want to look at increasing the sstable_size with LCS.

> Compacted row minimum size: 104
> Compacted row maximum size: 263210
> Compacted row mean size: 3041

Take a look at the nodetool cfhistograms to get a better idea of the row size and use that info when considering the sstable size.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 26/03/2013, at 6:16 AM, Andras Szerdahelyi <andras.szerdahe...@ignitionone.com> wrote:

Hello list,

Could anyone shed some light on how an FP chance of 0.01 can coexist with a measured FP ratio of .. 0.98 ? Am I reading this wrong or do 98% of the requests hitting the bloom filter create a false positive while the "target" false ratio is 0.01?
( Also key cache hit ratio is around 0.001 and sstables read is in the skies ( non-exponential (non-) drop off for LCS ) but that should be filed under "effect" and not "cause"?
)

[default@unknown] use KS;
Authenticated to keyspace: KS
[default@KS] describe CF;

    ColumnFamily: CF
      Key Validation Class: org.apache.cassandra.db.marshal.BytesType
      Default column value validator: org.apache.cassandra.db.marshal.BytesType
      Columns sorted by: org.apache.cassandra.db.marshal.BytesType
      GC grace seconds: 691200
      Compaction min/max thresholds: 4/32
      Read repair chance: 0.1
      DC Local Read repair chance: 0.0
      Replicate on write: true
      Caching: ALL
      Bloom Filter FP chance: 0.01
      Built indexes: []
      Compaction Strategy: org.apache.cassandra.db.compaction.LeveledCompactionStrategy
      Compaction Strategy Options:
        sstable_size_in_mb: 5
      Compression Options:
        sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor

Keyspace: KS
    Read Count: 628950
    Read Latency: 93.19921121869784 ms.
    Write Count: 1219021
    Write Latency: 0.14352380885973254 ms.
    Pending Tasks: 0
        Column Family: CF
        SSTable count: 22682
        Space used (live): 119771434915
        Space used (total): 119771434915
        Number of Keys (estimate): 203837952
        Memtable Columns Count: 13125
        Memtable Data Size: 33212827
        Memtable Switch Count: 15
        Read Count: 629009
        Read Latency: 88.434 ms.
        Write Count: 1219038
        Write Latency: 0.095 ms.
        Pending Tasks: 0
        Bloom Filter False Positives: 37939419
        Bloom Filter False Ratio: 0.97928
        Bloom Filter Space Used: 261572784
        Compacted row minimum size: 104
        Compacted row maximum size: 263210
        Compacted row mean size: 3041

I upgraded sstables after changing the FP chance

Thanks!
Andras

<attachment.eml>
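
For anyone reading the numbers above, here is a rough back-of-the-envelope sketch in Python that works only from the counters Andras posted. The `stats` dict and the helper names are made up for illustration. Two things are assumptions rather than facts from this thread: that the reported "False Ratio" is false positives divided by all positive filter answers (false plus true), and that the textbook formula for an ideally tuned bloom filter is a fair approximation of how the filters here behave.

```python
import math

# Counters copied from the cfstats output quoted in this thread.
stats = {
    "sstable_count": 22682,
    "space_used_live_bytes": 119771434915,
    "estimated_keys": 203837952,
    "read_count": 629009,
    "bf_false_positives": 37939419,
    "bf_false_ratio": 0.97928,
    "bf_space_used_bytes": 261572784,
}


def avg_sstable_mib(s):
    """Average on-disk sstable size in MiB (live space / sstable count)."""
    return s["space_used_live_bytes"] / s["sstable_count"] / (1024 * 1024)


def false_positives_per_read(s):
    """How many bloom filter false positives each read caused on average."""
    return s["bf_false_positives"] / s["read_count"]


def bloom_bits_per_key(s):
    """Bloom filter bits spent per (estimated) key."""
    return s["bf_space_used_bytes"] * 8 / s["estimated_keys"]


def ideal_fp_rate(bits_per_key):
    """Textbook false-positive rate of a bloom filter with an optimal
    number of hash functions: exp(-(m/n) * ln(2)^2).
    ASSUMPTION: used as an approximation only, not taken from Cassandra."""
    return math.exp(-bits_per_key * math.log(2) ** 2)


def implied_true_positives(s):
    """True positives implied by the reported ratio.
    ASSUMPTION: ratio = false / (false + true)."""
    r = s["bf_false_ratio"]
    return s["bf_false_positives"] * (1 - r) / r


if __name__ == "__main__":
    bits = bloom_bits_per_key(stats)
    print("avg sstable size (MiB):   %.2f" % avg_sstable_mib(stats))
    print("false positives per read: %.1f" % false_positives_per_read(stats))
    print("bloom bits per key:       %.2f" % bits)
    print("ideal per-filter FP rate: %.4f" % ideal_fp_rate(bits))
    print("implied true positives:   %.0f" % implied_true_positives(stats))
```

Under these assumptions the figures come out at roughly 5 MB per sstable (in line with sstable_size_in_mb: 5), about 60 false positives per read, and about 10 bits per key, which would put an ideal filter's per-lookup rate a little under 0.01. That might suggest the filters are sized for the configured chance and that the high ratio has more to do with how many sstables each read consults, but treat it as a sanity check on the posted counters, not a diagnosis.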