Thank you for the response. Compactionstats does not indicate that we are running behind (see below). FYI, since making the change from the default 5 MB to 256 MB I have been seeing increased GC pauses (one node got locked in GC spirals last night and had to be restarted), so I have actually decreased the size to 64 MB to see if that makes a difference. Here are the latest stats on a table that has more frequent row updates (and thus is more likely to compact old SSTables into the new size). Any insight is VERY much appreciated.
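For intuition on how changing sstable_size_in_mb affects the level count: under LCS, level N is sized to hold roughly 10**N sstables of the configured size, so the number of levels a given data volume needs can be estimated. A rough sketch (the helper name is mine; the data size is "Space used (live)" from the cfstats below, and this ignores L0 and overlap slack):

```python
def lcs_levels(total_bytes, sstable_mb):
    """Estimate LCS levels needed, assuming level N holds ~10**N sstables."""
    sstable_bytes = sstable_mb * 1024 * 1024
    level, capacity = 0, 0
    while capacity < total_bytes:
        level += 1
        capacity += (10 ** level) * sstable_bytes
    return level

data = 70_301_335_303  # "Space used (live)" for global_user
for mb in (5, 64, 256):
    print(f"{mb} MB sstables -> ~{lcs_levels(data, mb)} levels")
```

With 5 MB sstables this much data needs about 5 levels (consistent with the 6-entry level list below), while 64 MB or 256 MB sstables bring it down to about 3, which is why fewer sstables per read are expected after the change works through.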
Thanks

Column Family: global_user
SSTable count: 10347
SSTables in each level: [2, 10, 63, 1050/1000, 9222, 0, 0]
Space used (live): 70301335303
Space used (total): 71599083983
Number of Keys (estimate): 478791040
Memtable Columns Count: 24516
Memtable Data Size: 15901376
Memtable Switch Count: 3706
Read Count: 1103830636
Read Latency: 2.638 ms.
Write Count: 827824859
Write Latency: 0.059 ms.
Pending Tasks: 0
Bloom Filter False Positives: 12935974
Bloom Filter False Ratio: 0.12546
Bloom Filter Space Used: 329145520
Compacted row minimum size: 73
Compacted row maximum size: 4866323
Compacted row mean size: 335

[kwright@lxpcas001 ~]$ nodetool info
Token            : (invoke with -T/--tokens to see all 256 tokens)
ID               : 882f4831-35e9-4c2f-be6b-81b4c2343ca1
Gossip active    : true
Thrift active    : true
Load             : 211.3 GB
Generation No    : 1374259506
Uptime (seconds) : 1014090
Heap Memory (MB) : 5627.01 / 8072.00
Data Center      : DAL1
Rack             : RAC1
Exceptions       : 185
Key Cache        : size 104857072 (bytes), capacity 104857600 (bytes), 731222922 hits, 6607308542 requests, 0.143 recent hit rate, 14400 save period in seconds
Row Cache        : size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds

[kwright@lxpcas001 ~]$ nodetool compactionstats
pending tasks: 0
Active compaction remaining time : n/a

[kwright@lxpcas001 ~]$ nodetool cfhistograms users global_user
users/global_user histograms
Offset  SSTables  Write Latency  Read Latency  Row Size  Column Count
1  39997928  0  0  0  4552
2  14732779  0  0  0  112784201
3  12325618  0  0  0  252571336
4  6894542  0  0  0  24965369
5  1970175  108  0  0  13853402
6  156263  1196  0  0  23374848
7  18854  5365  0  0  5974743
8  1442  17396  0  0  7510167
10  204  137019  0  0  11223392
12  0  489477  0  0  6918122
14  0  1145914  0  0  4969766
17  0  3620828  3  0  4566327
20  0  6432030  14  0  2545871
24  0  11386809  54  0  2049213
29  0  13823989  431  0  1399119
35  0  11539190  2721  0  949174
42  0  6457952  11665  0  635466
50  0  2933479  65010  0  428200
60  0  1706411  115617  0  327847
72  0  1060158  135740  0  243561
86  0  580078  198312  634  180901
103  0  342643  240777  3912  142239
124  0  182482  450592  5  114299
149  0  95731  606666  7544545  87948
179  0  51167  640083  103528698  68726
215  0  38468  537811  145727072  53921
258  0  26060  413670  111347054  41730
310  0  16745  380406  18849783  32343
372  0  12705  976548  22350396  24799
446  0  9312  3150544  19059338  18536
535  0  6797  2532003  12349361  14133
642  0  4724  3204331  10323731  10604
770  0  3287  8156762  7357706  7684
924  0  2216  5074100  6625384  5771
1109  0  1660  7382161  3848898  4112
1331  0  1291  5776629  2664557  2915
1597  0  1099  5901952  1855835  2057
1916  0  941  6044518  1258502  1490
2299  0  778  6294312  863014  1005
2759  0  675  4835393  612338  689
3311  0  592  3939427  444323  478
3973  0  455  2848316  330928  360
4768  0  493  1958814  255025  232
5722  0  969  1274543  198373  183
6866  0  440  819421  156189  100
8239  0  204  532863  123229  79
9887  0  130  357274  98316  40
11864  0  58  252345  77067  35
14237  0  19  186547  61019  19
17084  0  16  149989  47782  19
20501  0  38  126544  37107  6
24601  0  681  96833  28891  4
29521  0  692  81161  21909  3
35425  0  621  61802  16498  2
42510  0  794  52684  12506  1
51012  0  1040  47671  9183  1
61214  0  836  39531  6728  1
73457  0  745  35585  4824  0
88148  0  790  35345  3490  0
105778  0  528  26822  2454  0
126934  0  223  14407  1738  0
152321  0  57  7401  1197  0
182785  0  58  5015  808  0
219342  0  31  3258  578  0
263210  0  37  1993  376  0
315852  0  31  1191  280  0
379022  0  6  651  203  0
454826  0  2  683  119  0
545791  0  15  181  87  0
654949  0  0  118  59  0
785939  0  1  201  34  0
943127  0  0  226  22  0
1131752  0  109  1058  19  0
1358102  0  203  8539  6  0
1629722  0  11  522  4  0
1955666  0  0  11  2  0
2346799  0  0  0  1  0
2816159  0  0  0  2  0
3379391  0  0  0  1  0
4055269  0  0  0  0  0
4866323  0  0  0  1  0
5839588  0  0  0  0  0
7007506  0  0  0  0  0
8409007  0  0  0  0  0
10090808  0  0  0  0  0
12108970  0  0  0  0  0
14530764  0  0  0  0  0
17436917  0  0  0  0  0
20924300  0  0  0  0  0
25109160  0  0  6  0  0

On 7/31/13 5:25 AM, "aaron morton" <aa...@thelastpickle.com> wrote:

>This looks suspicious
>
>> SSTables in each level: [1, 3, 101/100, 1022/1000, 10587/10000, 1750]
>It says there are
>6 levels in the levelled DB, which may explain why the number of SSTables per read is so high.
>It also says some of the levels have more files than they should; check nodetool compactionstats to see if compaction can keep up.
>
>> Offset SSTables Write Latency Read Latency Row Size Column Count
>> 1  574961651  0  0  0  67267
>> 2  50623075  0  0  0  7
>> 3  6977136  0  0  0  866112674
>> 4  1068933  151  0  0  5
>> 5  100458  1752  0  0  0
>> 6  2299  8845  0  0  0
>> 7  25  36376  0  0  0
>When we would expect most reads to be served from 2 sstables when using LCS: http://www.datastax.com/dev/blog/when-to-use-leveled-compaction
>
>> Space used (live): 106151372078
>You don't have a huge amount of data.
>
>> Read Count: 754323272
>...
>> Bloom Filter False Positives: 22785986
>> Bloom Filter False Ratio: 0.24048
>> Bloom Filter Space Used: 659403424
>The False Ratio there is since it was last read. But it's still high.
>However, remember the numbers above say there are a lot of sstables.
>
>I would check nodetool compactionstats to see if compaction can keep up.
>You've changed the sstable_size_in_mb; I'd also wait to see what it looks like when that change has gone through. I'd like to see if the sstable count goes down.
>
>Hope that helps.
>
>-----------------
>Aaron Morton
>Cassandra Consultant
>New Zealand
>
>@aaronmorton
>http://www.thelastpickle.com
>
>On 27/07/2013, at 3:25 PM, Keith Wright <kwri...@nanigans.com> wrote:
>
>> Also here is the output from cfhistograms if that helps. Thanks all!
>>
>> users/cookie_user_lookup histograms
>> Offset  SSTables  Write Latency  Read Latency  Row Size  Column Count
>> 1  574961651  0  0  0  67267
>> 2  50623075  0  0  0  7
>> 3  6977136  0  0  0  866112674
>> 4  1068933  151  0  0  5
>> 5  100458  1752  0  0  0
>> 6  2299  8845  0  0  0
>> 7  25  36376  0  0  0
>> 8  0  118417  1  0  0
>> 10  0  970897  80  0  0
>> 12  0  3154897  706  0  0
>> 14  0  6681561  5633  0  0
>> 17  0  17522902  62716  0  0
>> 20  0  23857810  323821  0  0
>> 24  0  30173547  1756172  0  0
>> 29  0  22315480  7493321  0  0
>> 35  0  9574788  20574499  0  0
>> 42  0  2816872  23382953  15  0
>> 50  0  764169  15332694  16681  0
>> 60  0  245998  10826754  50571  0
>> 72  0  112974  13640480  0  0
>> 86  0  76213  12988932  0  0
>> 103  0  68305  12774730  4  0
>> 124  0  45879  14726045  0  0
>> 149  0  34684  14306264  474654029  0
>> 179  0  24458  8489834  391359834  0
>> 215  0  21349  3616683  98819  0
>> 258  0  12373  1471382  0  0
>> 310  0  8138  2096660  0  0
>> 372  0  6420  12175498  0  0
>> 446  0  4981  44092631  0  0
>> 535  0  4052  19968489  0  0
>> 642  0  3514  30524161  0  0
>> 770  0  3411  88732015  0  0
>> 924  0  3379  36941969  0  0
>> 1109  0  3259  88311470  0  0
>> 1331  0  2925  34015836  0  0
>> 1597  0  2522  25807588  0  0
>> 1916  0  1909  24346980  0  0
>> 2299  0  1083  28822035  0  0
>> 2759  0  646  12627664  0  0
>> 3311  0  489  9374038  0  0
>> 3973  0  331  5114594  0  0
>> 4768  0  797  3119917  0  0
>> 5722  0  317  1805150  0  0
>> 6866  0  94  1105472  0  0
>> 8239  0  68  701533  0  0
>> 9887  0  28  459354  0  0
>> 11864  0  16  304326  0  0
>> 14237  0  20  207982  0  0
>> 17084  0  66  153394  0  0
>> 20501  0  422  134500  0  0
>> 24601  0  716  138258  0  0
>> 29521  0  722  128058  0  0
>> 35425  0  711  123178  0  0
>> 42510  0  663  116942  0  0
>> 51012  0  495  107036  0  0
>> 61214  0  465  109102  0  0
>> 73457  0  370  108586  0  0
>> 88148  0  257  77200  0  0
>> 105778  0  177  47428  0  0
>> 126934  0  59  22524  0  0
>> 152321  0  37  12021  0  0
>> 182785  0  17  7531  0  0
>> 219342  0  28  5127  0  0
>> 263210  0  12  3554  0  0
>> 315852  0  12  2101  0  0
>> 379022  0  4  1417  0  0
>> 454826  0  4  942  0  0
>> 545791  0  1  539  0  0
>> 654949  0  0  341  0  0
>> 785939  0  0  253  0  0
>> 943127  0  0  273  0  0
>> 1131752  0  8  607  0  0
>> 1358102  0  22  2954  0  0
>> 1629722  0  0  493  0  0
>> 1955666  0  0  56  0  0
>> 2346799  0  0  81  0  0
>> 2816159  0  0  13  0  0
>> 3379391  0  0  0  0  0
>> 4055269  0  0  0  0  0
>> 4866323  0  0  0  0  0
>> 5839588  0  0  3  0  0
>> 7007506  0  0  0  0  0
>> 8409007  0  0  0  0  0
>> 10090808  0  0  0  0  0
>> 12108970  0  0  0  0  0
>> 14530764  0  0  0  0  0
>> 17436917  0  0  0  0  0
>> 20924300  0  0  2  0  0
>> 25109160  0  0  1  0  0
>>
>> From: Keith Wright <kwri...@nanigans.com>
>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Date: Friday, July 26, 2013 11:10 PM
>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Subject: key cache hit rate and BF false positive
>>
>> Hi all,
>>
>> I am experiencing VERY poor key cache hit rate on my 6 node C* 1.2.4 with vnode cluster. I am using CQL3 with LCS and yesterday increased my SSTable size from the default 5 MB to 256 MB, although I did not force a major compaction and am instead letting the new size take effect organically as compactions are normally triggered. The issue for this email is that my key cache hit rate is TERRIBLE at < 1% and my bloom filter false positive rate is around 0.2 (see cfstats, info, and create for an example table). I am using the OOTB bloom filter settings and only 300 MB of key cache. I'm guessing that I need to decrease my bloom_filter_fp_chance from the default 0.1 to 0.01 (and trigger upgradesstables for it to take effect) and increase my key cache size (my nodes have 32 GB of RAM and I'm running an 8 GB heap on Java 7), but I thought I would get everyone's opinion first. My read latency is good but I'm afraid I'll max out the TPS on my SSD drives, which are currently hovering around 40% utilization.
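[Editor's note on the bloom_filter_fp_chance question above: the standard Bloom filter sizing relation, bits per key = -ln(p) / (ln 2)^2, gives a rough feel for the memory cost of moving from 0.1 to 0.01. A sketch under that textbook formula, not Cassandra's exact implementation, which rounds to discrete bucket sizes; the key count is the cfstats estimate quoted below:]

```python
import math

def bloom_filter_bytes(n_keys, fp_chance):
    # Classic Bloom filter sizing: m/n = -ln(p) / (ln 2)^2 bits per key.
    bits_per_key = -math.log(fp_chance) / (math.log(2) ** 2)
    return int(n_keys * bits_per_key / 8)

n_keys = 954_198_528  # "Number of Keys (estimate)" for cookie_user_lookup
for p in (0.1, 0.01):
    print(f"p={p}: ~{bloom_filter_bytes(n_keys, p) / 2**20:.0f} MB")
```

Tightening p from 0.1 to 0.01 exactly doubles the per-key bits (ln 0.01 / ln 0.1 = 2), so on this table it would cost roughly an extra half gigabyte of bloom filter space per node, off-heap in 1.2.
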
>>
>> Should I wait to see what effect the sstable size change has first, as it will obviously decrease the number of sstables?
>> What do you recommend for bloom filter and key cache values?
>>
>> I tried upping the key cache size to 600 MB but did not see a sizable difference in hit rate. I am also planning on trying out row cache but wanted to hold off until I got these values handled first. FYI each node has 3 x 800 GB SSD in JBOD running RF 3.
>>
>> Thanks!
>>
>> Column Family: cookie_user_lookup
>> SSTable count: 13464
>> SSTables in each level: [1, 3, 101/100, 1022/1000, 10587/10000, 1750]
>> Space used (live): 106151372078
>> Space used (total): 106870515200
>> Number of Keys (estimate): 954198528
>> Memtable Columns Count: 369605
>> Memtable Data Size: 170078697
>> Memtable Switch Count: 668
>> Read Count: 754323272
>> Read Latency: 1.029 ms.
>> Write Count: 130312827
>> Write Latency: 0.023 ms.
>> Pending Tasks: 0
>> Bloom Filter False Positives: 22785986
>> Bloom Filter False Ratio: 0.24048
>> Bloom Filter Space Used: 659403424
>> Compacted row minimum size: 36
>> Compacted row maximum size: 215
>> Compacted row mean size: 162
>>
>> Token            : (invoke with -T/--tokens to see all 256 tokens)
>> ID               : 6e1a0ac1-6354-40e1-bba4-063c7bb0983d
>> Gossip active    : true
>> Thrift active    : true
>> Load             : 204.64 GB
>> Generation No    : 1374259760
>> Uptime (seconds) : 634791
>> Heap Memory (MB) : 5601.37 / 8072.00
>> Data Center      : DAL1
>> Rack             : RAC1
>> Exceptions       : 78
>> Key Cache        : size 314572552 (bytes), capacity 314572800 (bytes), 327775491 hits, 3259949221 requests, 0.162 recent hit rate, 14400 save period in seconds
>> Row Cache        : size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds
>>
>> CREATE TABLE cookie_user_lookup (
>>   cookie text PRIMARY KEY,
>>   creation_time timestamp,
>>   opt_out boolean,
>>   status int,
>>   user_id timeuuid
>> ) WITH
>>   bloom_filter_fp_chance=0.100000 AND
>>   caching='KEYS_ONLY' AND
>>   comment='' AND
>>   dclocal_read_repair_chance=0.000000 AND
>>   gc_grace_seconds=86400 AND
>>   read_repair_chance=0.100000 AND
>>   replicate_on_write='true' AND
>>   populate_io_cache_on_flush='false' AND
>>   compaction={'sstable_size_in_mb': '256', 'class': 'LeveledCompactionStrategy'} AND
>>   compression={'chunk_length_kb': '8', 'crc_check_chance': '0.1', 'sstable_compression': 'LZ4Compressor'};
>
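[Editor's note: the lifetime key cache hit rate can be recovered from the hits/requests counters in the two nodetool info dumps in this thread; the "recent hit rate" field only covers a short window. A quick sketch of the arithmetic, with the counters copied from the output above:]

```python
# (hits, requests) from the Key Cache lines in the two nodetool info dumps.
samples = {
    "latest run (global_user node)": (731_222_922, 6_607_308_542),
    "earlier run (cookie_user_lookup node)": (327_775_491, 3_259_949_221),
}
for name, (hits, requests) in samples.items():
    print(f"{name}: lifetime hit rate {hits / requests:.3f}")
```

Both work out to roughly 10-11%, so the lifetime rate is better than the < 1% figure quoted earlier in the thread, though still low for a key cache, consistent with reads touching many sstables.
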