Running incremental repair splits sstables into a “repaired” set and an 
unrepaired set, and compaction then keeps the two sets separate, which would 
produce the kind of split you’re describing.

Were you running / did you run incremental repair?
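
If so, sstablemetadata on one of the old files should show a non-zero
"Repaired at" value (the tool lives in tools/bin in the 3.0 tarball; the path
depends on your install), something like:

    sstablemetadata mc-4828-big-Data.db | grep "Repaired at"

A non-zero timestamp there means the sstable is in the repaired set and will
only ever be compacted with other repaired sstables, which would explain the
two separate groups of buckets.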

-- 
Jeff Jirsa


> On Nov 13, 2017, at 5:04 AM, Nicolas Guyomar <nicolas.guyo...@gmail.com> 
> wrote:
> 
> Hi everyone,
> 
> I'm seeing quite a strange behavior with STCS on 3.0.13: the strategy seems to 
> have "forgotten" about old sstables and started a completely new cycle from 
> scratch, leaving the old sstables untouched on disk: 
> 
> Something happened on Nov 10 on every node, which resulted in all of those 
> sstables being left behind: 
> 
> -rw-r--r--.  8 cassandra cassandra   15G Nov  9 22:22 mc-4828-big-Data.db
> -rw-r--r--.  8 cassandra cassandra  4.8G Nov 10 01:39 mc-4955-big-Data.db
> -rw-r--r--.  8 cassandra cassandra  2.4G Nov 10 01:45 mc-4957-big-Data.db
> -rw-r--r--.  8 cassandra cassandra  662M Nov 10 01:47 mc-4959-big-Data.db
> -rw-r--r--.  8 cassandra cassandra  2.8G Nov 10 03:46 mc-5099-big-Data.db
> -rw-r--r--.  8 cassandra cassandra  4.6G Nov 10 03:58 mc-5121-big-Data.db
> -rw-r--r--.  7 cassandra cassandra   53M Nov 10 08:45 mc-5447-big-Data.db
> -rw-r--r--.  7 cassandra cassandra  219M Nov 10 08:46 mc-5454-big-Data.db
> -rw-r--r--.  7 cassandra cassandra  650M Nov 10 08:46 mc-5452-big-Data.db
> -rw-r--r--.  7 cassandra cassandra  1.2G Nov 10 08:48 mc-5458-big-Data.db
> -rw-r--r--.  7 cassandra cassandra  1.5G Nov 10 08:50 mc-5465-big-Data.db
> -rw-r--r--.  7 cassandra cassandra  504M Nov 10 09:39 mc-5526-big-Data.db
> -rw-r--r--.  7 cassandra cassandra   57M Nov 10 09:40 mc-5527-big-Data.db
> -rw-r--r--.  7 cassandra cassandra  101M Nov 10 09:41 mc-5532-big-Data.db
> -rw-r--r--.  7 cassandra cassandra   86M Nov 10 09:41 mc-5533-big-Data.db
> -rw-r--r--.  7 cassandra cassandra  134M Nov 10 09:42 mc-5537-big-Data.db
> -rw-r--r--.  7 cassandra cassandra  3.9G Nov 10 09:54 mc-5538-big-Data.db
> -rw-r--r--.  7 cassandra cassandra  1.3G Nov 10 09:57 mc-5548-big-Data.db
> -rw-r--r--.  6 cassandra cassandra   16G Nov 11 01:23 mc-6474-big-Data.db
> -rw-r--r--.  4 cassandra cassandra   17G Nov 12 06:44 mc-7898-big-Data.db
> -rw-r--r--.  3 cassandra cassandra  8.2G Nov 12 13:45 mc-8226-big-Data.db
> -rw-r--r--.  2 cassandra cassandra  6.8G Nov 12 22:38 mc-8581-big-Data.db
> -rw-r--r--.  2 cassandra cassandra  6.1G Nov 13 03:10 mc-8937-big-Data.db
> -rw-r--r--.  2 cassandra cassandra  3.1G Nov 13 04:12 mc-9019-big-Data.db
> -rw-r--r--.  2 cassandra cassandra  3.0G Nov 13 05:56 mc-9112-big-Data.db
> -rw-r--r--.  2 cassandra cassandra  1.2G Nov 13 06:14 mc-9138-big-Data.db
> -rw-r--r--.  2 cassandra cassandra  1.1G Nov 13 06:27 mc-9159-big-Data.db
> -rw-r--r--.  2 cassandra cassandra  1.2G Nov 13 06:46 mc-9182-big-Data.db
> -rw-r--r--.  1 cassandra cassandra  1.9G Nov 13 07:18 mc-9202-big-Data.db
> -rw-r--r--.  1 cassandra cassandra  353M Nov 13 07:22 mc-9207-big-Data.db
> -rw-r--r--.  1 cassandra cassandra  120M Nov 13 07:22 mc-9208-big-Data.db
> -rw-r--r--.  1 cassandra cassandra  100M Nov 13 07:23 mc-9209-big-Data.db
> -rw-r--r--.  1 cassandra cassandra   67M Nov 13 07:25 mc-9210-big-Data.db
> -rw-r--r--.  1 cassandra cassandra   51M Nov 13 07:25 mc-9211-big-Data.db
> -rw-r--r--.  1 cassandra cassandra   73M Nov 13 07:27 mc-9212-big-Data.db
> 
> 
> TRACE logs from the Compaction Manager show that sstables from before Nov 10 
> are grouped in different buckets than the ones from after Nov 10.
> 
> At first I thought of some coldness behavior that would filter out those "old" 
> sstables, but looking at the code 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java#L237
>  I don't see any coldness or time criterion used to create the buckets.
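> 
> For reference, here is roughly how I read that method: files are grouped purely 
> by size (a file joins a bucket if its size is within bucket_low..bucket_high of 
> the bucket's average, defaults 0.5/1.5, or if both are under min_sstable_size). 
> A simplified, untested sketch (names are mine, not the real code):
> 
>     import java.util.*;
> 
>     // Simplified illustration of SizeTieredCompactionStrategy.getBuckets():
>     // sstables are grouped purely by size, age is never looked at.
>     class StcsBucketsSketch
>     {
>         static List<List<Long>> bucketsBySize(List<Long> sizes, double bucketLow,
>                                               double bucketHigh, long minSSTableSize)
>         {
>             Collections.sort(sizes);                  // smallest first
>             List<List<Long>> buckets = new ArrayList<>();
>             List<Long> current = null;
>             double avg = 0;
>             for (long size : sizes)
>             {
>                 // join the current bucket if the size is close enough to its
>                 // average, or if both are below min_sstable_size
>                 if (current != null
>                     && ((size > avg * bucketLow && size < avg * bucketHigh)
>                         || (size < minSSTableSize && avg < minSSTableSize)))
>                 {
>                     current.add(size);
>                     avg = (avg * (current.size() - 1) + size) / current.size();
>                 }
>                 else
>                 {
>                     current = new ArrayList<>();
>                     current.add(size);
>                     avg = size;
>                     buckets.add(current);
>                 }
>             }
>             return buckets;
>         }
>     }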
> 
> I tried restarting the node, but the sstables are still being grouped into 2 
> sets of buckets, split around Nov 10.
> 
> I may have missed something in the logs, but they are free of errors/warnings 
> around that Nov 10 time.
> 
> For what it's worth, restarting the node fixed nodetool status, which was 
> reporting a wrong Load (nearly 2 TB per node instead of 300 GB). We have been 
> loading data for a week now; it seems this can happen from time to time.
> 
> If anyone has ever experienced that kind of behavior, I'd be glad to know 
> whether it is OK or not; I'd like to avoid manually triggering a 
> UserDefinedCompaction over JMX ;) 
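> 
> (To be explicit, the JMX operation I mean is forceUserDefinedCompaction on the 
> org.apache.cassandra.db:type=CompactionManager MBean, which takes a 
> comma-separated list of -Data.db files (possibly with their full path), e.g. 
> with jmxterm:
> 
>     run -b org.apache.cassandra.db:type=CompactionManager forceUserDefinedCompaction /path/to/mc-4828-big-Data.db,/path/to/mc-4955-big-Data.db
> 
> but I'd rather understand why compaction ignores those files than compact them 
> by hand.)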
> 
> Thank you  
> 
