Hello, I am evaluating Cassandra for a log retrieval application. My ring consists of 3 m2.xlarge instances (17.1 GB memory, 6.5 ECU (2 virtual cores with 3.25 EC2 Compute Units each), 420 GB of local instance storage, 64-bit platform), and I am writing at roughly 220 writes/sec, which adds roughly 60GB of data per day. All of this is simple enough, and all three nodes are humming along with essentially no load.
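For reference, the write path looks roughly like this (a minimal sketch assuming pycassa; the keyspace, CF, host, and key layout are illustrative placeholders, and the 10-day TTL is explained below):

    import time
    from pycassa.pool import ConnectionPool
    from pycassa.columnfamily import ColumnFamily

    TEN_DAYS = 10 * 24 * 3600  # TTL in seconds

    pool = ConnectionPool('LogKeyspace', ['node1:9160'])  # any of the three nodes
    logs = ColumnFamily(pool, 'logs')

    def write_log_entry(source, message):
        # One row per log source, one column per entry (~3KB each, given
        # 60GB/day at 220 writes/sec); every column carries the 10-day TTL.
        logs.insert(source, {'%.6f' % time.time(): message}, ttl=TEN_DAYS)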
The issue is that I am writing all my data with a TTL of 10 days. After 10 days my cluster crashes with a java.lang.OutOfMemoryError during compaction of the big column family that contains roughly 95% of the data. So after 10 days my data set is about 600GB, and from then on Cassandra has to tombstone and purge 60GB of data per day, the same rate of roughly 220 deletes/second. I am not sure whether Cassandra should be able to handle this, whether I should take a partitioning approach (one CF per day), or whether there are simply some tweaks I need to make in cassandra.yaml.

I have tried:

1. Decreasing flush_largest_memtables_at to 0.4
2. Setting reduce_cache_sizes_at and reduce_cache_capacity_to to 1

The issue remains the same:

    WARN [ScheduledTasks:1] 2012-06-11 19:39:42,017 GCInspector.java (line 145)
    Heap is 0.9920103380107628 full. You may need to reduce memtable and/or
    cache sizes. Cassandra will now flush up to the two largest memtables to
    free up memory. Adjust flush_largest_memtables_at threshold in
    cassandra.yaml if you don't want Cassandra to do this automatically.

Eventually each node just dies with the following; this affects all nodes in the cluster, not just one:

    Dump file is incomplete: file size limit
    ERROR 19:39:39,695 Exception in thread Thread[ReadStage:134,5,main]
    java.lang.OutOfMemoryError: Java heap space
    ERROR 19:39:39,724 Exception in thread Thread[MutationStage:57,5,main]
    java.lang.OutOfMemoryError: Java heap space
        at org.apache.cassandra.utils.FBUtilities.hashToBigInteger(FBUtilities.java:213)
        at org.apache.cassandra.dht.RandomPartitioner.getToken(RandomPartitioner.java:154)
        at org.apache.cassandra.dht.RandomPartitioner.decorateKey(RandomPartitioner.java:47)
        at org.apache.cassandra.db.RowPosition.forKey(RowPosition.java:54)

Any help is highly appreciated. It would be great to tweak things so that I can keep a moving window of 10 days in Cassandra while dropping the old data… Or, if there is another recommended way to deal with such sliding time windows, I am open to ideas. Thank you for your help!
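PS: To make the partitioning idea concrete, this is roughly what I have in mind (a sketch assuming pycassa's SystemManager; keyspace and CF names are placeholders): write each day's data into its own CF and drop the CF once it leaves the 10-day window, so expiry becomes a single drop instead of ~19 million tombstones per day.

    from datetime import date, timedelta
    from pycassa.system_manager import SystemManager

    KEYSPACE = 'LogKeyspace'  # placeholder
    WINDOW_DAYS = 10

    def cf_for(day):
        # One column family per day, e.g. logs_20120611
        return 'logs_%s' % day.strftime('%Y%m%d')

    # Run once a day (e.g. from cron): pre-create tomorrow's CF and
    # drop the one that just fell out of the 10-day window.
    sys_mgr = SystemManager('node1:9160')
    today = date.today()
    existing = sys_mgr.get_keyspace_column_families(KEYSPACE)

    new_cf = cf_for(today + timedelta(days=1))
    if new_cf not in existing:
        sys_mgr.create_column_family(KEYSPACE, new_cf)

    old_cf = cf_for(today - timedelta(days=WINDOW_DAYS))
    if old_cf in existing:
        sys_mgr.drop_column_family(KEYSPACE, old_cf)
    sys_mgr.close()

The cost would be that reads spanning the window have to fan out over up to 10 CFs, but no tombstones would ever need to be compacted away. Is that the recommended route, or should a single TTL'd CF be able to cope?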