I'll capture what we're seeing here for anyone else who may look
into this in more detail later.

Our standard heap growth is ~300K between collections, with regular
ParNew collections happening about every 4 seconds on average. All
very healthy.

The memtable flush (where we see almost all our CMS activity) seems to
have some balloon effect that, despite a 64MB memtable size, causes
over 512MB of heap to be consumed in half a second. On top of the
hefty amount of garbage this creates, the MaxTenuringThreshold=1
setting means most of that garbage seems to spill immediately into the
tenured generation, which quickly fills and triggers a CMS. The rate of
garbage overflowing to tenured seems to outstrip the speed of the
concurrent mark worker, which is almost always interrupted and ends in
a concurrent mode failure (falling back to a stop-the-world
collection). However, the tenured collection is usually hugely
effective, recovering over half the total heap.
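
For a sense of scale, here is the arithmetic behind the figures above as
a small Java sketch. It uses only the numbers quoted in this message and
assumes "~300K" means roughly 300 kilobytes; the class and method names
are made up for illustration and are not part of Cassandra.

```java
// Back-of-the-envelope arithmetic only; all names here are hypothetical.
final class FlushGarbageEstimate {

    // Ratio of heap consumed during a flush to the configured memtable size.
    static double amplification(double heapConsumedMb, double memtableMb) {
        return heapConsumedMb / memtableMb;
    }

    // Generic rate: an amount divided by the time it took, in seconds.
    static double ratePerSecond(double amount, double seconds) {
        return amount / seconds;
    }

    public static void main(String[] args) {
        // Flush: over 512MB consumed in about half a second for a 64MB memtable.
        double amp = amplification(512.0, 64.0);
        double flushMbPerSec = ratePerSecond(512.0, 0.5);

        // Steady state: ~300K (assumed kilobytes) per ~4 second ParNew interval.
        double steadyKbPerSec = ratePerSecond(300.0, 4.0);

        System.out.printf("flush amplification: %.1fx%n", amp);
        System.out.printf("flush allocation rate: %.0f MB/s%n", flushMbPerSec);
        System.out.printf("steady-state growth: %.0f KB/s%n", steadyKbPerSec);
    }
}
```

So the flush appears to generate at least 8x the memtable size in
garbage, at something like 1 GB/s, against a steady-state growth of
about 75 KB/s.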

Two questions for the group then:

1) Does this seem like a sane amount of garbage (512MB) to generate
when flushing a 64MB table to disk?
2) Is this possibly a case of the MaxTenuringThreshold=1 working
against Cassandra? The flush seems to create a lot of garbage very
quickly such that normal CMS isn't even possible. I'm sure there was a
reason to introduce this setting but I'm not sure it's universally
beneficial. Is there any history on the decision to opt for immediate
promotion rather than using an adaptable number of survivor
generations?
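
If anyone wants to gather numbers on question 2, one possible approach
is to compare the per-collector counters before and after a flush under
different MaxTenuringThreshold values. The sketch below uses the
standard java.lang.management API; the class name and the allocation
loop standing in for a flush are hypothetical. It only sees the JVM it
runs in, so for a real node the same counters would have to be read
over JMX instead. With ParNew and CMS enabled I would expect the beans
to be named "ParNew" and "ConcurrentMarkSweep", but that is an
assumption worth checking on your JVM.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: snapshots GC collection counts for the current JVM.
final class GcCounterSnapshot {

    // Collector name -> collections so far (-1 if the JVM does not report it).
    static Map<String, Long> collectionCounts() {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            counts.put(gc.getName(), gc.getCollectionCount());
        }
        return counts;
    }

    // Per-collector difference between two snapshots.
    static Map<String, Long> delta(Map<String, Long> before, Map<String, Long> after) {
        Map<String, Long> diff = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            Long prev = before.get(e.getKey());
            diff.put(e.getKey(), e.getValue() - (prev == null ? 0L : prev));
        }
        return diff;
    }

    public static void main(String[] args) {
        Map<String, Long> before = collectionCounts();

        // Stand-in for the flush: allocate a burst of short-lived garbage.
        long sink = 0;
        for (int i = 0; i < 512; i++) {
            byte[] chunk = new byte[1024 * 1024];
            sink += chunk.length;
        }

        Map<String, Long> after = collectionCounts();
        System.out.println("allocated bytes: " + sink);
        System.out.println("collections during burst: " + delta(before, after));
    }
}
```

A large jump in the old-generation collector's count for a single flush
would support the idea that garbage is being promoted too early.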
