I just want to chime in: we also had trouble keeping up with compaction once (with vnodes and SSDs). I'd also recommend keeping an eye on your open file limit, which might bite you.
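In case it helps, here is a minimal sketch of how one might check that on Linux. It assumes /proc is available and that you know the PID of the Cassandra process; the function names are just made up for illustration. It reads the "Max open files" row from /proc/<pid>/limits and counts the entries in /proc/<pid>/fd. Since each sstable is several files on disk, a node with thousands of sstables can get closer to the limit than you might expect.

```python
import os


def parse_max_open_files(limits_text):
    """Return (soft, hard) 'Max open files' limits from /proc/<pid>/limits text.

    A limit reported as 'unlimited' is returned as None.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Row layout: "Max open files  <soft>  <hard>  files"
            fields = line.split()
            soft, hard = fields[3], fields[4]

            def to_int(value):
                return None if value == "unlimited" else int(value)

            return to_int(soft), to_int(hard)
    raise ValueError("no 'Max open files' line found")


def fd_usage(pid):
    """Return (open_fd_count, soft_limit) for a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as f:
        soft, _hard = parse_max_open_files(f.read())
    # Each entry in /proc/<pid>/fd is one open file descriptor.
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    return open_fds, soft
```

Calling fd_usage(pid) with the Cassandra PID (as the same user or root) gives you the two numbers to compare; alerting when the count gets near the soft limit is one option.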
Cheers,
Jens

On Friday, August 19, 2016, Mark Rose <[email protected]> wrote:

> Hi Ezra,
>
> Are you making frequent changes to your rows (including TTL'ed
> values), or mostly inserting new ones? If you're only inserting new
> data, it's probable using size-tiered compaction would work better for
> you. If you are TTL'ing whole rows, consider date-tiered.
>
> If leveled compaction is still the best strategy, one way to catch up
> with compactions is to have less data per partition -- in other words,
> use more machines. Leveled compaction is CPU expensive. You are CPU
> bottlenecked currently, or from the other perspective, you have too
> much data per node for leveled compaction.
>
> At this point, compaction is so far behind that you'll likely be
> getting high latency if you're reading old rows (since dozens to
> hundreds of uncompacted sstables will likely need to be checked for
> matching rows). You may be better off with size tiered compaction,
> even if it will mean always reading several sstables per read (higher
> latency than when leveled can keep up).
>
> How much data do you have per node? Do you update/insert to/delete
> rows? Do you TTL?
>
> Cheers,
> Mark
>
> On Wed, Aug 17, 2016 at 2:39 PM, Ezra Stuetzel <[email protected]> wrote:
> > I have one node in my cluster 2.2.7 (just upgraded from 2.2.6 hoping to fix
> > issue) which seems to be stuck in a weird state -- with a large number of
> > pending compactions and sstables. The node is compacting about 500gb/day,
> > number of pending compactions is going up at about 50/day. It is at about
> > 2300 pending compactions now. I have tried increasing number of compaction
> > threads and the compaction throughput, which doesn't seem to help eliminate
> > the many pending compactions.
> >
> > I have tried running 'nodetool cleanup' and 'nodetool compact'. The latter
> > has fixed the issue in the past, but most recently I was getting OOM errors,
> > probably due to the large number of sstables. I upgraded to 2.2.7 and am no
> > longer getting OOM errors, but also it does not resolve the issue. I do see
> > this message in the logs:
> >
> >> INFO [RMI TCP Connection(611)-10.9.2.218] 2016-08-17 01:50:01,985
> >> CompactionManager.java:610 - Cannot perform a full major compaction as
> >> repaired and unrepaired sstables cannot be compacted together. These two set
> >> of sstables will be compacted separately.
> >
> > Below are the 'nodetool tablestats' comparing a normal and the problematic
> > node. You can see problematic node has many many more sstables, and they are
> > all in level 1. What is the best way to fix this? Can I just delete those
> > sstables somehow then run a repair?
> >>
> >> Normal node
> >>>
> >>> keyspace: mykeyspace
> >>> Read Count: 0
> >>> Read Latency: NaN ms.
> >>> Write Count: 31905656
> >>> Write Latency: 0.051713177939359714 ms.
> >>> Pending Flushes: 0
> >>> Table: mytable
> >>> SSTable count: 1908
> >>> SSTables in each level: [11/4, 20/10, 213/100, 1356/1000, 306, 0, 0, 0, 0]
> >>> Space used (live): 301894591442
> >>> Space used (total): 301894591442
> >>>
> >>> Problematic node
> >>>
> >>> Keyspace: mykeyspace
> >>> Read Count: 0
> >>> Read Latency: NaN ms.
> >>> Write Count: 30520190
> >>> Write Latency: 0.05171286705620116 ms.
> >>> Pending Flushes: 0
> >>> Table: mytable
> >>> SSTable count: 14105
> >>> SSTables in each level: [13039/4, 21/10, 206/100, 831, 0, 0, 0, 0, 0]
> >>> Space used (live): 561143255289
> >>> Space used (total): 561143255289
> >
> > Thanks,
> >
> > Ezra

--
Jens Rantil
Backend engineer
Tink AB

Email: [email protected]
Phone: +46 708 84 18 32
Web: www.tink.se

Facebook <https://www.facebook.com/#!/tink.se> Linkedin <http://www.linkedin.com/company/2735919?trk=vsrp_companies_res_photo&trkInfo=VSRPsearchId%3A1057023381369207406670%2CVSRPtargetId%3A2735919%2CVSRPcmpt%3Aprimary> Twitter <https://twitter.com/tink>
