After 15 hours, you have “112000 SSTables over all nodes for all CF’s” - 
assuming I’m parsing that correctly, that’s about 30 per table (100 KS * 10 
tables * 4 nodes = 4000 tables), which is actually not unreasonable with LCS.
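You can check that per-table arithmetic directly on each node; something along these lines (keyspace/table names here are placeholders, substitute your own):

```shell
# Placeholder keyspace/table -- prints the live SSTable count for one
# table on the node where this is run.
nodetool cfstats ks1.cf1 | grep "SSTable count"
```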

Your symptom is heavy GC and you’re running 2.1.6 and don’t want to upgrade 
unless there’s a bug that fixes it. There was a memory leak 
https://issues.apache.org/jira/browse/CASSANDRA-9549 – fixed in 2.1.7. Most 
users who saw this saw it using DTCS rather than LCS; I’m unsure if it could be 
the cause of your pain, but if I were in your position, I’d be upgrading to 
find out.

- Jeff

From:  "Walsh, Stephen"
Reply-To:  "user@cassandra.apache.org"
Date:  Friday, October 30, 2015 at 10:20 AM
To:  "user@cassandra.apache.org"
Subject:  SSTables are not getting removed

Hey all,

 

First off, thank you for taking the time to read this.

 

-------------------

SYSTEM SPEC

-------------------

 

We’re using Cassandra version 2.1.6 (please don’t ask us to upgrade just yet 
unless you are aware of an existing bug for this issue)

We are running on AWS 4-core, 16 GB servers

We are running a 4-node cluster

We are writing about 1220 rows per second

We are querying about 3400 rows per second

There are 100 keyspaces, each with 10 CF’s

Our Replication factor is 3

All our data has a TTL of 10 seconds

Our “gc_grace_seconds” is 0

Durable writes is disabled

Read & write Consistency is ONE

Our HEAP size is 8GB (rest of the heap is default)

Our compaction strategy is Leveled (LCS)

-XX:CMSInitiatingOccupancyFraction=50
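Taken together, the TTL, gc_grace and compaction settings above correspond to a table definition along these lines (keyspace, table and column names here are placeholders, not our real schema):

```shell
# Placeholder names -- illustrates the settings listed above.
cqlsh -e "
CREATE TABLE ks1.cf1 (
    id text PRIMARY KEY,
    payload text
) WITH gc_grace_seconds = 0
  AND default_time_to_live = 10
  AND compaction = {'class': 'LeveledCompactionStrategy'};"
```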

 

-------------------

Cassandra.yaml

-------------------

Below are the values we’ve changed 

 

hinted_handoff_enabled: false

data_file_directories:

    - /var/lib/cassandra/data

commitlog_directory: /cassandra

saved_caches_directory: /var/lib/cassandra/saved_caches

memtable_offheap_space_in_mb: 4096

memtable_cleanup_threshold: 0.99

memtable_allocation_type: offheap_objects

memtable_flush_writers: 4

thrift_framed_transport_size_in_mb: 150

concurrent_compactors: 4

compaction_throughput_mb_per_sec: 0

endpoint_snitch: GossipingPropertyFileSnitch

 

 

-------------------

Issue

-------------------

Over time all our CF’s develop about 1 SSTable per 30 min.

Each SSTable contains only 36 rows, and all of their data is “Marked for 
deletion” (tombstoned)

Over about 15 hours we could have up to 112000 SSTables over all nodes for all 
CF’s

This grinds our system to a halt and causes a major GC nearly every second.

 

So far the only way to get around this is to run a cron job every hour that 
does a “nodetool compact”.
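For reference, the workaround is a crontab entry along these lines (the path and logging are a sketch, not an exact copy of our job):

```shell
# Hourly major compaction across all keyspaces -- a stopgap, not a fix.
0 * * * * /usr/bin/nodetool compact >> /var/log/compact-cron.log 2>&1
```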

 

Is there any reason that these SSTables are not being deleted during normal 
compaction / minor GC / major GC?

 

Regards

Stephen Walsh

 

 

 

 
This email (including any attachments) is proprietary to Aspect Software, Inc. 
and may contain information that is confidential. If you have received this 
message in error, please do not read, copy or forward this message. Please 
notify the sender immediately, delete it from your system and destroy any 
copies. You may not further disclose or distribute this email or its 
attachments. 
