There is a forced flusher that kicks in when your heap becomes full. Look for log lines from GCInspector. There is a bug that prevents a memtable from being flushed when it contains only full-key delete mutations; see https://issues.apache.org/jira/browse/CASSANDRA-3741. For me it happened when we started to move to a new schema, so the old column families started to receive delete-only operations. An indication of this is GCInspector being unable to flush anything but the system keyspace.

21.03.12 17:29, A J wrote:
I have increased index_interval. Will let you know if I see a difference.


My theory is that memtables are not getting flushed. If I manually
flush them, the heap consumption goes down drastically.
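For reference, a manual flush like the one described above can be scripted. This is only a minimal sketch: it assumes `nodetool` is on the PATH and relies on `nodetool flush` flushing every keyspace when none is named. The helper name `flush_command` and the default host are made up for illustration.

```python
import subprocess

def flush_command(host="localhost", keyspace=None, column_families=()):
    """Build a `nodetool flush` command line (hypothetical helper).

    With no keyspace, nodetool flushes the memtables of all keyspaces.
    """
    cmd = ["nodetool", "-h", host, "flush"]
    if keyspace:
        cmd.append(keyspace)
        # Column family names only make sense after a keyspace name.
        cmd.extend(column_families)
    return cmd

def flush_all(host="localhost"):
    """Run the flush for every keyspace; raises if nodetool fails."""
    subprocess.run(flush_command(host), check=True)
```

Calling `flush_all()` from a cron job would be one crude workaround while the automatic flushing is being investigated.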

I think when memtable_total_space_in_mb is exceeded, not enough
memtables are getting flushed. There are 5000 memtables (one for each
CF), but each memtable in itself is small. So Cassandra flushing one or
two memtables is not helping.

Question: How many memtables are flushed when
memtable_total_space_in_mb is exceeded ? Any way to flush all
memtables when the threshold is reached ?

Thanks.

On Wed, Mar 21, 2012 at 8:56 AM, Vitalii Tymchyshyn<tiv...@gmail.com>  wrote:
Hello.

There is also a primary row index. Its space can be controlled with the
index_interval setting. I don't know if you can look up its memory usage
somewhere. If I were you, I'd take the jmap tool and examine a heap
histogram first and a heap dump second.
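A rough sketch of that jmap workflow, assuming the JDK's `jmap` is on the PATH and you know the Cassandra process id. The helper names and the dump path are placeholders, not anything from this thread.

```python
import subprocess

def heap_histogram(pid, top=20):
    """Return the first `top` lines of `jmap -histo <pid>` output.

    The histogram is sorted by bytes used, so the top lines show
    which classes dominate the heap.
    """
    result = subprocess.run(
        ["jmap", "-histo", str(pid)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()[:top]

def heap_dump_command(pid, path):
    """Build the command for a binary heap dump (the second step)."""
    return ["jmap", f"-dump:format=b,file={path}", str(pid)]
```

The histogram is cheap to take, so it is worth looking at before writing out a dump from a 16GB heap.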

Best regards, Vitalii Tymchyshyn

20.03.12 18:12, A J wrote:

I have both row cache and column cache disabled for all my CFs.

cfstats says "Bloom Filter Space Used: 1760" per CF. Assuming it is in
bytes, that is a total of about 9MB of bloom filters for 5K CFs, which
is not a lot.
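The arithmetic behind that estimate, as a quick check. It assumes, as above, that the cfstats figure is in bytes and is the same for every CF.

```python
def bloom_filter_total_bytes(per_cf_bytes, num_cfs):
    """Total bloom filter space if every CF reports the same size."""
    return per_cf_bytes * num_cfs

total = bloom_filter_total_bytes(1760, 5000)
print(total, round(total / 1e6, 1))  # 8800000 8.8
```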


On Tue, Mar 20, 2012 at 11:09 AM, Vitalii Tymchyshyn<tiv...@gmail.com> wrote:
Hello.

From my experience it's unwise to create many column families for the
same keys, because your bloom filters and row indexes will be multiplied.
If you have 5000, you should expect your heap requirements to be
multiplied by the same factor. Also check your cache sizes. The default,
AFAIR, is 100000 keys per column family.

20.03.12 16:05, A J wrote:

OK, the last thread says that from 1.0 onwards, thousands of CFs should
not be a problem.

But I am finding that all the allocated heap memory is getting consumed.
I started with an 8GB heap and then, on reading

http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management

realized that a minimum of 1MB per memtable is used by the per-memtable
arena allocator.
So with 5K CFs, 5GB will be used just by arena allocators.
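Spelled out, the estimate looks like this. It is only a lower-bound sketch that assumes one live memtable per CF and the 1MB minimum arena size from the blog post; real usage will be higher.

```python
MB = 1024 * 1024

def arena_floor_bytes(num_cfs, min_arena_bytes=MB):
    """Lower bound on heap held by memtable arenas, one memtable per CF."""
    return num_cfs * min_arena_bytes

floor = arena_floor_bytes(5000)
print(f"{floor / 1024**3:.2f} GiB")  # 4.88 GiB
```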

But even after increasing the heap to 16GB, I am finding that all the
heap is getting consumed. Is there a different formula for heap
calculation when you have thousands of CFs ?
Any other configuration that I need to change ?

Thanks.

On Mon, Mar 19, 2012 at 10:35 AM, Alain RODRIGUEZ<arodr...@gmail.com> wrote:
This subject has already been discussed; this may help you:


http://markmail.org/message/6dybhww56bxvufzf#query:+page:1+mid:6dybhww56bxvufzf+state:results

If you still have questions after reading this thread or others about
the same topic, do not hesitate to ask again.

Alain


2012/3/19 A J<s5a...@gmail.com>
How many Column Families are one too many for Cassandra ?
I created a db with 5000 CFs (I can go into the reasons later) but the
latency seems to be very erratic now. Not sure if it is because of the
number of CFs.

Thanks.

