Hi, I've got a strange problem: the database on one node has inflated 10X after running repair. This is not the result of receiving missed data.
I didn't perform repair within my usual 10 day cycle, so I followed the recommended practice:
http://wiki.apache.org/cassandra/Operations#Dealing_with_the_consequences_of_nodetool_repair_not_running_within_GCGraceSeconds

The sequence of events was like this:

1) set GCGraceSeconds to some huge value
2) perform rolling upgrade from 0.7.4 to 0.7.6-2
3) run nodetool repair on the first node in the cluster at ~10pm. It has a ~30G database
4) at 2.30am decide to leave it running all night, and wake up at 9am to find it still running
5) late morning investigation shows that db size has increased to 370G. The snapshot folder accounts for only 30G
6) node starts to run out of disk space http://pastebin.com/Sm0B7nfR
7) decide to bail! Reset GCGraceSeconds to 864000 and restart node to stop repair
8) as node restarts it deletes a bunch of tmp files, reducing db size from 370G to 270G
9) node is now constantly performing minor compactions, with du rising slightly and then falling by a greater amount each time a minor compaction deletes an sstable
10) gradually disk usage is coming down. Currently at 254G (3pm)
11) performance of node obviously not great!

Investigation of the database reveals the main problem to have occurred in a single column family, UserFights. This contains millions of fight records from our MMO, but actually exactly the same number as the MonsterFights cf. However, the comparative size is:

    Column Family: MonsterFights
    SSTable count: 38
    Space used (live): 13867454647
    Space used (total): 13867454647 (13G)
    Memtable Columns Count: 516
    Memtable Data Size: 598770
    Memtable Switch Count: 4
    Read Count: 514
    Read Latency: 157.649 ms.
    Write Count: 4059
    Write Latency: 0.025 ms.
    Pending Tasks: 0
    Key cache capacity: 200000
    Key cache size: 183004
    Key cache hit rate: 0.0023566218452145135
    Row cache: disabled
    Compacted row minimum size: 771
    Compacted row maximum size: 943127
    Compacted row mean size: 3208

    Column Family: UserFights
    SSTable count: 549
    Space used (live): 185355019679
    Space used (total): 219489031691 (219G)
    Memtable Columns Count: 483
    Memtable Data Size: 560569
    Memtable Switch Count: 8
    Read Count: 2159
    Read Latency: 2589.150 ms.
    Write Count: 4080
    Write Latency: 0.018 ms.
    Pending Tasks: 0
    Key cache capacity: 200000
    Key cache size: 200000
    Key cache hit rate: 0.03357770764288416
    Row cache: disabled
    Compacted row minimum size: 925
    Compacted row maximum size: 12108970
    Compacted row mean size: 503069

These stats were taken at 3pm. At 1pm UserFights was using 224G total, so the overall size is gradually coming down.

Another observation is the following appearing in the logs during the minor compactions:

    Compacting large row 536c69636b5061756c (121235810 bytes) incrementally

The largest number of fights any user has performed on our MMO that I can find is short of 10,000. Each fight record is smaller than 1K, so the largest row should be under roughly 10MB, yet this one is about 121MB. It looks like these rows have grown +10X somehow.

The size of UserFights on another replica node, which actually has a slightly higher proportion of the ring, is:

    Column Family: UserFights
    SSTable count: 14
    Space used (live): 17844982744
    Space used (total): 17936528583 (18G)
    Memtable Columns Count: 767
    Memtable Data Size: 891153
    Memtable Switch Count: 6
    Read Count: 2298
    Read Latency: 61.020 ms.
    Write Count: 4261
    Write Latency: 0.104 ms.
    Pending Tasks: 0
    Key cache capacity: 200000
    Key cache size: 55172
    Key cache hit rate: 0.8079570484581498
    Row cache: disabled
    Compacted row minimum size: 925
    Compacted row maximum size: 12108970
    Compacted row mean size: 846477
    ...

All ideas and suggestions greatly appreciated as always!

Dominic
ria101.wordpress.com
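
P.S. In case it helps anyone check my arithmetic, here is a rough Python sketch of the two size comparisons. It only uses figures quoted in this post. Treating "1K" as 1024 bytes and "short of 10,000" as exactly 10,000 are my assumptions (both are upper bounds), and the constant and function names are purely illustrative.

```python
# Rough sanity check of the size figures quoted in this post.
# Assumptions (upper bounds, not measurements): at most 10,000 fights
# per user, and at most 1024 bytes per fight record.
MAX_FIGHTS_PER_USER = 10_000
MAX_FIGHT_RECORD_BYTES = 1024

# From the "Compacting large row ... incrementally" log line.
OBSERVED_LARGE_ROW_BYTES = 121_235_810

# "Space used (total)" for UserFights on the repaired node and on the replica.
REPAIRED_NODE_TOTAL_BYTES = 219_489_031_691
REPLICA_NODE_TOTAL_BYTES = 17_936_528_583


def ratio(observed: int, expected: int) -> float:
    """Return how many times larger `observed` is than `expected`."""
    return observed / expected


# Largest row we would expect if every record were at its maximum size.
expected_max_row_bytes = MAX_FIGHTS_PER_USER * MAX_FIGHT_RECORD_BYTES

row_inflation = ratio(OBSERVED_LARGE_ROW_BYTES, expected_max_row_bytes)
cf_inflation = ratio(REPAIRED_NODE_TOTAL_BYTES, REPLICA_NODE_TOTAL_BYTES)

print(f"expected max row size: {expected_max_row_bytes} bytes")
print(f"observed row vs expected max: {row_inflation:.1f}x")
print(f"UserFights on repaired node vs replica: {cf_inflation:.1f}x")
```

Both ratios come out above 10X, so the per-row growth seen in the compaction log and the overall growth of the column family look consistent with each other.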