This looks suspicious:

> SSTables in each level: [1, 3, 101/100, 1022/1000, 10587/10000, 1750]

It says there are 6 levels in the levelled DB, which may explain why the number of SSTables per read is so high. It also says some of the levels have more files than they should; check nodetool compactionstats to see if compaction can keep up.
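To put rough numbers on those level counts, here's a quick back-of-the-envelope sketch (my own approximation, not something measured on your cluster; it assumes LCS's default 10x fan-out per level and uses the "Space used (live)" figure from your cfstats):

    # Rough LCS layout: level N should hold about 10^N sstables of sstable_size_in_mb each.
    def lcs_layout(total_bytes, sstable_mb, fanout=10):
        remaining = total_bytes / (sstable_mb * 1024 ** 2)
        levels, cap = [], fanout
        while remaining > 0:
            take = min(remaining, cap)
            levels.append(int(round(take)))
            remaining -= take
            cap *= fanout
        return levels

    space_live = 106151372078  # Space used (live), from the cfstats below
    for size_mb in (5, 256):
        levels = lcs_layout(space_live, size_mb)
        print(size_mb, "MB sstables ->", sum(levels), "sstables over", len(levels), "levels:", levels)

With 5 MB sstables, ~100 GB wants on the order of 20,000 files spread over 5 levels (plus L0), which is the same shape you're seeing; at 256 MB it drops to a few hundred files in about 3 levels. So the size change should bring both the file count and the SSTables-per-read numbers down once compaction works through it.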
> Offset    SSTables   Write Latency  Read Latency  Row Size   Column Count
> 1         574961651  0              0             0          67267
> 2         50623075   0              0             0          7
> 3         6977136    0              0             0          866112674
> 4         1068933    151            0             0          5
> 5         100458     1752           0             0          0
> 6         2299       8845           0             0          0
> 7         25         36376          0             0          0

whereas we would expect most reads to be served from 2 SSTables when using LCS:
http://www.datastax.com/dev/blog/when-to-use-leveled-compaction

> Space used (live): 106151372078

You don't have a huge amount of data.

> Read Count: 754323272
...
> Bloom Filter False Positives: 22785986
> Bloom Filter False Ratio: 0.24048
> Bloom Filter Space Used: 659403424

The False Ratio there only covers the period since it was last read, but it's still high. However, remember the numbers above say there are a lot of SSTables. I would check nodetool compactionstats to see if compaction can keep up.

You've changed the sstable_size_in_mb; I'd also wait to see what things look like once that change has gone through. I'd like to see if the SSTable count goes down.
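If you do later go ahead with dropping bloom_filter_fp_chance from 0.1 to 0.01 (as you asked about below), it's worth knowing the memory cost first. As a rough estimate using the textbook Bloom filter formula (Cassandra's actual sizing differs a little, so treat it as ballpark only), with the ~954M keys from your cfstats:

    import math

    def bloom_filter_mb(n_keys, fp_chance):
        # Standard Bloom filter sizing: m = -n * ln(p) / (ln 2)^2 bits
        bits = -n_keys * math.log(fp_chance) / (math.log(2) ** 2)
        return bits / 8 / 1024 ** 2

    n_keys = 954198528  # Number of Keys (estimate), from the cfstats below
    for p in (0.1, 0.01):
        print("fp_chance", p, "->", round(bloom_filter_mb(n_keys, p)), "MB")

That comes out around 550 MB at 0.1 versus roughly 1.1 GB at 0.01, i.e. the filters about double in size, so factor that into the memory budget before making the change.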
Hope that helps.

-----------------
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 27/07/2013, at 3:25 PM, Keith Wright <kwri...@nanigans.com> wrote:

> Also here is the output from cfhistograms if that helps. Thanks all!
>
> users/cookie_user_lookup histograms
> Offset    SSTables   Write Latency  Read Latency  Row Size   Column Count
> 1         574961651  0              0             0          67267
> 2         50623075   0              0             0          7
> 3         6977136    0              0             0          866112674
> 4         1068933    151            0             0          5
> 5         100458     1752           0             0          0
> 6         2299       8845           0             0          0
> 7         25         36376          0             0          0
> 8         0          118417         1             0          0
> 10        0          970897         80            0          0
> 12        0          3154897        706           0          0
> 14        0          6681561        5633          0          0
> 17        0          17522902       62716         0          0
> 20        0          23857810       323821        0          0
> 24        0          30173547       1756172       0          0
> 29        0          22315480       7493321       0          0
> 35        0          9574788        20574499      0          0
> 42        0          2816872        23382953      15         0
> 50        0          764169         15332694      16681      0
> 60        0          245998         10826754      50571      0
> 72        0          112974         13640480      0          0
> 86        0          76213          12988932      0          0
> 103       0          68305          12774730      4          0
> 124       0          45879          14726045      0          0
> 149       0          34684          14306264      474654029  0
> 179       0          24458          8489834       391359834  0
> 215       0          21349          3616683       98819      0
> 258       0          12373          1471382       0          0
> 310       0          8138           2096660       0          0
> 372       0          6420           12175498      0          0
> 446       0          4981           44092631      0          0
> 535       0          4052           19968489      0          0
> 642       0          3514           30524161      0          0
> 770       0          3411           88732015      0          0
> 924       0          3379           36941969      0          0
> 1109      0          3259           88311470      0          0
> 1331      0          2925           34015836      0          0
> 1597      0          2522           25807588      0          0
> 1916      0          1909           24346980      0          0
> 2299      0          1083           28822035      0          0
> 2759      0          646            12627664      0          0
> 3311      0          489            9374038       0          0
> 3973      0          331            5114594       0          0
> 4768      0          797            3119917       0          0
> 5722      0          317            1805150       0          0
> 6866      0          94             1105472       0          0
> 8239      0          68             701533        0          0
> 9887      0          28             459354        0          0
> 11864     0          16             304326        0          0
> 14237     0          20             207982        0          0
> 17084     0          66             153394        0          0
> 20501     0          422            134500        0          0
> 24601     0          716            138258        0          0
> 29521     0          722            128058        0          0
> 35425     0          711            123178        0          0
> 42510     0          663            116942        0          0
> 51012     0          495            107036        0          0
> 61214     0          465            109102        0          0
> 73457     0          370            108586        0          0
> 88148     0          257            77200         0          0
> 105778    0          177            47428         0          0
> 126934    0          59             22524         0          0
> 152321    0          37             12021         0          0
> 182785    0          17             7531          0          0
> 219342    0          28             5127          0          0
> 263210    0          12             3554          0          0
> 315852    0          12             2101          0          0
> 379022    0          4              1417          0          0
> 454826    0          4              942           0          0
> 545791    0          1              539           0          0
> 654949    0          0              341           0          0
> 785939    0          0              253           0          0
> 943127    0          0              273           0          0
> 1131752   0          8              607           0          0
> 1358102   0          22             2954          0          0
> 1629722   0          0              493           0          0
> 1955666   0          0              56            0          0
> 2346799   0          0              81            0          0
> 2816159   0          0              13            0          0
> 3379391   0          0              0             0          0
> 4055269   0          0              0             0          0
> 4866323   0          0              0             0          0
> 5839588   0          0              3             0          0
> 7007506   0          0              0             0          0
> 8409007   0          0              0             0          0
> 10090808  0          0              0             0          0
> 12108970  0          0              0             0          0
> 14530764  0          0              0             0          0
> 17436917  0          0              0             0          0
> 20924300  0          0              2             0          0
> 25109160  0          0              1             0          0
>
> From: Keith Wright <kwri...@nanigans.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Friday, July 26, 2013 11:10 PM
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: key cache hit rate and BF false positive
>
> Hi all,
>
> I am experiencing VERY poor key cache hit rate on my 6 node C* 1.2.4 with Vnode cluster. I am using CQL3 with LCS and yesterday increased my SSTable size from the default 5 MB to 256 MB, although I did not force a major compaction and am instead letting the new size take effect organically as compactions are normally triggered. The issue for this email is that my key cache hit rate is TERRIBLE at < 1% and my bloom filter false positive rate is around 0.2 (see cfstats, info, and create for the example table). I am using the OOTB bloom filter settings and only 300 MB of key cache. I'm guessing that I need to change my bloom_filter_fp_chance from the default 0.1 to 0.01 (and trigger upgradesstables for it to take effect) and increase my key cache size (my nodes have 32 GB of RAM and I'm running an 8 GB heap on Java 7), but I thought I would get everyone's opinion first. My read latency is good, but I'm afraid I'll max out the TPS on my SSD drives, which are currently hovering around 40% utilization.
>
> Should I wait to see what effect the sstable size change has first, as it will obviously decrease the number of sstables?
> What do you recommend for bloom filter and key cache values?
>
> I tried upping the key cache size to 600 MB but did not see a sizable difference in hit rate. I am also planning on trying out row cache but wanted to hold off until I got these values handled first. FYI – each node has 3 x 800 GB SSD in JBOD running RF 3.
>
> Thanks!
>
> Column Family: cookie_user_lookup
> SSTable count: 13464
> SSTables in each level: [1, 3, 101/100, 1022/1000, 10587/10000, 1750]
> Space used (live): 106151372078
> Space used (total): 106870515200
> Number of Keys (estimate): 954198528
> Memtable Columns Count: 369605
> Memtable Data Size: 170078697
> Memtable Switch Count: 668
> Read Count: 754323272
> Read Latency: 1.029 ms.
> Write Count: 130312827
> Write Latency: 0.023 ms.
> Pending Tasks: 0
> Bloom Filter False Positives: 22785986
> Bloom Filter False Ratio: 0.24048
> Bloom Filter Space Used: 659403424
> Compacted row minimum size: 36
> Compacted row maximum size: 215
> Compacted row mean size: 162
>
> Token            : (invoke with -T/--tokens to see all 256 tokens)
> ID               : 6e1a0ac1-6354-40e1-bba4-063c7bb0983d
> Gossip active    : true
> Thrift active    : true
> Load             : 204.64 GB
> Generation No    : 1374259760
> Uptime (seconds) : 634791
> Heap Memory (MB) : 5601.37 / 8072.00
> Data Center      : DAL1
> Rack             : RAC1
> Exceptions       : 78
> Key Cache        : size 314572552 (bytes), capacity 314572800 (bytes), 327775491 hits, 3259949221 requests, 0.162 recent hit rate, 14400 save period in seconds
> Row Cache        : size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds
>
> CREATE TABLE cookie_user_lookup (
>   cookie text PRIMARY KEY,
>   creation_time timestamp,
>   opt_out boolean,
>   status int,
>   user_id timeuuid
> ) WITH
>   bloom_filter_fp_chance=0.100000 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.000000 AND
>   gc_grace_seconds=86400 AND
>   read_repair_chance=0.100000 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   compaction={'sstable_size_in_mb': '256', 'class': 'LeveledCompactionStrategy'} AND
>   compression={'chunk_length_kb': '8', 'crc_check_chance': '0.1', 'sstable_compression': 'LZ4Compressor'};
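One more back-of-the-envelope number on your key cache question (the 150 bytes per entry below is my assumption for key + index position + overhead, not a measured figure): with ~954M keys, a 300 MB or even 600 MB key cache can only ever hold a small fraction of them, which on its own caps the hit rate unless reads are heavily skewed towards a small hot set.

    def key_cache_coverage(cache_bytes, n_keys, bytes_per_entry=150):
        # bytes_per_entry is an assumed average cost of one cached key, not a measured value
        entries = cache_bytes // bytes_per_entry
        return entries, entries / float(n_keys)

    n_keys = 954198528  # Number of Keys (estimate) from the cfstats above
    for mb in (300, 600):
        entries, fraction = key_cache_coverage(mb * 1024 ** 2, n_keys)
        print(mb, "MB ->", entries, "entries, about", round(100 * fraction, 2), "% of keys")

At that assumed entry size, 300 MB holds roughly 2M entries (about 0.2% of your keys) and 600 MB roughly 4M, which would explain why doubling the cache didn't move the hit rate much: the cache can only help for whatever hot subset of keys actually fits in it.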