> Now, a full stop of the application was what I was seeing extensively before
> (100-200 times over the course of a major compaction as reported by
> gossipers on other nodes). I have also just noticed that the previous
> instability (ie application stops) correlated with the compaction of a few
> column families characterized by fairly fat rows (10 mb mean size, max sizes
> 150-200 mb, up to a million+ columns per row). My theory is that each row
> being compacted with the old settings was being promoted to the old
> generation, thereby running the heap out of space and causing a stop the
> world gc. With the new settings, rows being compacted typically remain in
> the young generation, allowing them to be cleaned up more quickly with less
> effort on the part of the garbage collector. Does this theory sound
> reasonable?

Sounds reasonable I think. In addition to sizing the young gen, decreasing:

   in_memory_compaction_limit_in_mb: 64

from the default of 64 might help here I suppose.

--
/ Peter Schuller
