If possible, prefer m5 over m4: m5 instances run on a newer (KVM-based) 
hypervisor, single-core performance is about 10% better than on m4, and m5 is 
even slightly cheaper than m4.

Thomas

From: Erick Ramirez <flightc...@gmail.com>
Sent: Thursday, 30 January 2020 03:00
To: user@cassandra.apache.org
Subject: Re: Cassandra going OOM due to tombstones (heapdump screenshots 
provided)

It looks like the number of tables is the problem: 5,000 - 10,000 tables is 
way above the recommendations.
Take a look here: 
https://docs.datastax.com/en/dse-planning/doc/planning/planningAntiPatterns.html#planningAntiPatterns__AntiPatTooManyTables
This suggests that 5-10GB of heap is going to be taken up just by the table 
information (1MB per table).
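
To make that arithmetic concrete, here is a minimal sketch. It assumes the 
1MB-per-table figure quoted above (a rule of thumb from this thread, not a 
measured value), and the function name is made up for illustration.

```python
# Rough estimate of heap consumed by table metadata alone.
# Assumption: ~1 MB of heap per table, as quoted in this thread.
MB_PER_TABLE = 1


def table_overhead_gb(num_tables, mb_per_table=MB_PER_TABLE):
    """Return the approximate heap (in GB) taken up by table metadata."""
    # Decimal GB (1000 MB) is used to match the thread's round figures.
    return num_tables * mb_per_table / 1000


for n in (200, 500, 5000, 10000):
    print(f"{n:>6} tables -> ~{table_overhead_gb(n):.1f} GB of heap")
```

With an 8GB or 12GB heap, 5,000 - 10,000 tables would leave little or nothing 
for actual reads and writes under this assumption.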

+1 to Paul Chandler & Hannu Kröger. Although there isn't a hard limit on the 
maximum number of tables, there's a reasonable number that is operationally 
sound and we recommend that 200 total tables per cluster is the sweet spot. We 
know from experience that the clusters suffer as the total number of tables 
approaches 400+ so stick as close to 200 as possible. I had these 
recommendations published in the DataStax Docs a couple of years ago to provide 
clear guidance to users.

1000 keyspaces suggests that you have a multi-tenant setup. Perhaps you can 
distribute the keyspaces across multiple clusters so each cluster has fewer than 
500 tables. To be clear, the number of keyspaces isn't relevant in this context 
-- it's the total number of tables across all keyspaces that matters.
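
As an illustration of how such a split could be planned, here is a minimal 
sketch using a greedy first-fit-decreasing assignment. The function name, the 
keyspace names and the 5-tables-per-keyspace figure are hypothetical; only the 
"fewer than 500 tables per cluster" limit comes from the paragraph above.

```python
def assign_keyspaces(tables_per_keyspace, max_tables=499):
    """Group keyspaces into clusters so no cluster exceeds max_tables.

    tables_per_keyspace maps a keyspace name to its table count.
    Uses first-fit decreasing: largest keyspaces are placed first, each
    into the first cluster that still has room.
    """
    clusters = []  # each entry: {"keyspaces": [names], "tables": total}
    ordered = sorted(tables_per_keyspace.items(),
                     key=lambda item: item[1], reverse=True)
    for name, count in ordered:
        if count > max_tables:
            raise ValueError(f"{name} alone has {count} tables")
        for cluster in clusters:
            if cluster["tables"] + count <= max_tables:
                cluster["keyspaces"].append(name)
                cluster["tables"] += count
                break
        else:
            # No existing cluster has room, so start a new one.
            clusters.append({"keyspaces": [name], "tables": count})
    return clusters


# Hypothetical input: 1000 tenant keyspaces with 5 tables each.
plan = assign_keyspaces({f"tenant_{i}": 5 for i in range(1000)})
print(len(plan), "clusters needed")
```

Staying close to the 200-table sweet spot instead would just mean passing 
max_tables=200.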

- We observed this problem on a c4.4xlarge (AWS EC2) instance having 30GB RAM 
with 8GB heap
- We observed the same problem on a c4.8xlarge having 60GB RAM with 12GB heap

A little off-topic, but it sounds like you've been evaluating different instance 
types. The c4 instances may not be ideal for your circumstances because you're 
giving up RAM in exchange for more powerful CPUs. I generally recommend m4 
instances because they're a good balance of CPU and RAM for the money. In an 
m4.4xlarge configuration, you lose some raw CPU power compared to a c4.4xlarge 
(2.4GHz Intel Xeon E5-2676 vs 2.9GHz E5-2666) but gain 34GB of RAM (64GB vs 
30GB) for nearly identical pricing. I think the m4 type is better value than 
c4. YMMV, but run your tests and you might be surprised.

In relation to the heap, I imagine you're using CMS, so allocate at least 16GB, 
but 20 or 24GB might turn out to be the ideal size for your cluster based on 
your testing. Just make sure you reserve at least 8GB of RAM for the operating 
system.
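
A quick way to check which of those heap sizes fit a given instance is sketched 
below. The 16/20/24GB candidates and the 8GB operating system reserve come from 
the paragraph above; the function name is made up for illustration, and the 
check ignores off-heap memory and page cache, which also need room.

```python
OS_RESERVE_GB = 8                  # minimum RAM to leave for the OS
CANDIDATE_HEAPS_GB = (16, 20, 24)  # heap sizes suggested above for CMS


def feasible_heaps(ram_gb):
    """Return the candidate heap sizes that still leave the OS reserve free."""
    return [heap for heap in CANDIDATE_HEAPS_GB
            if ram_gb - heap >= OS_RESERVE_GB]


# RAM sizes of the instance types mentioned in this thread.
for instance, ram_gb in (("c4.4xlarge", 30), ("c4.8xlarge", 60),
                         ("m4.4xlarge", 64)):
    print(instance, feasible_heaps(ram_gb))
```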

I hope this helps. Cheers!