Hi everyone,

This is my first question, so apologies if I get something wrong.

I have a small Cassandra cluster built on 3 nodes. It started life as a
2.0.x cluster and was upgraded to 2.0.15, then 2.1.13, then 3.0.4 and
recently 3.0.6. The OS is Ubuntu.

A few tables use DateTieredCompactionStrategy and are suffering from a
constantly growing SSTable count. I have the feeling this has something to
do with the upgrades, but I need some hints on how to debug the issue.
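
In case it matters, here is the kind of check I intend to run first to see
whether compactions are simply queuing up (suggestions for better checks
are welcome):

nodetool compactionstats    # pending and active compaction tasks
nodetool tpstats            # CompactionExecutor pool: active/pending/blocked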

Tables are created like:
CREATE TABLE <table> (
 ...
PRIMARY KEY (...)
) WITH CLUSTERING ORDER BY (...)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 7776000
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
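
I have not set any DTCS-specific sub-options, so max_sstable_age_days and
base_time_seconds should be at whatever the 3.0 defaults are. If tuning
them turns out to be the fix, I assume the change would look roughly like
this (the values are only an example, not something I have applied):

ALTER TABLE <keyspace>.<table>
WITH compaction = {
    'class': 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy',
    'min_threshold': '4',
    'max_threshold': '32',
    'max_sstable_age_days': '90',     -- roughly matching the 3-month TTL
    'base_time_seconds': '3600'
};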

Here is the "nodetool cfstats" output for that table:
Read Count: 39
Read Latency: 85.03307692307692 ms.
Write Count: 9845275
Write Latency: 0.09604882382665797 ms.
Pending Flushes: 0
Table: <table>
SSTable count: 48
Space used (live): 19566109394
Space used (total): 19566109394
Space used by snapshots (total): 109796505570
Off heap memory used (total): 11317941
SSTable Compression Ratio: 0.22632301701483284
Number of keys (estimate): 2557
Memtable cell count: 0
Memtable data size: 0
Memtable off heap memory used: 0
Memtable switch count: 828
Local read count: 39
Local read latency: 93.051 ms
Local write count: 9845275
Local write latency: 0.106 ms
Pending flushes: 0
Bloom filter false positives: 2
Bloom filter false ratio: 0.00000
Bloom filter space used: 10200
Bloom filter off heap memory used: 9816
Index summary off heap memory used: 4677
Compression metadata off heap memory used: 11303448
Compacted partition minimum bytes: 150
Compacted partition maximum bytes: 4139110981
Compacted partition mean bytes: 13463937
Average live cells per slice (last five minutes): 59.69230769230769
Maximum live cells per slice (last five minutes): 149
Average tombstones per slice (last five minutes): 8.564102564102564
Maximum tombstones per slice (last five minutes): 42

According to the "nodetool compactionhistory" output for <keyspace>.<table>,
the oldest timestamp is "Thu, 30 Jun 2016 13:14:23 GMT"
and the most recent one is "Thu, 07 Jul 2016 12:15:50 GMT" (that is today).
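
To cross-check the compaction history against the SSTables themselves, my
next step is to dump the min/max timestamps and the droppable-tombstone
estimate of every data file with sstablemetadata (the path below is just a
placeholder for my actual data directory):

for f in /var/lib/cassandra/data/<keyspace>/<table>-*/*-Data.db; do
    echo "== $f"
    sstablemetadata "$f" | grep -E 'Minimum timestamp|Maximum timestamp|Estimated droppable tombstones'
done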

However the SSTable count is still very high compared to tables that use a
different compaction strategy. If I run "nodetool compact <table>", the
SSTable count decreases dramatically to a reasonable number.
I have read several articles, including
http://www.datastax.com/dev/blog/datetieredcompactionstrategy, but I cannot
really tell whether this is expected behaviour.
What concerns me is the high tombstone count on reads, even though these
are insert-only tables. Compacting the table makes the tombstone issue
disappear. Yes, we use a TTL to expire data after 3 months, and I have not
touched the GC grace period.
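
Given the data is insert-only with a TTL, my (possibly wrong) understanding
is that whole SSTables should eventually become fully expired and be
dropped, unless an overlapping older SSTable keeps them alive. I was going
to check for that with sstableexpiredblockers, which I believe ships with
the Cassandra tools:

sstableexpiredblockers <keyspace> <table>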
Looking at the file system, I see that the very first *-Data.db file is
15GB, while the other 43 *-Data.db files range from 50 to 150MB in size.
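
For reference, that observation comes from simply listing the data files by
size, something like:

ls -lhS /var/lib/cassandra/data/<keyspace>/<table>-*/*-Data.db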

How can I debug this mis-compaction issue? Any help is much appreciated.
Best,
