Hello,

I completely agree with Elliott above: it is hard to say what *this cluster* needs. That said, my colleague Jon wrote a small guide on how to tune this in most cases, or at least as a starting point. We often write posts on questions that come up repeatedly on the mailing list, as that gives us hints about good topics to cover (and saves us repeating the same answer here every day :)). This was definitely one of the most 'in demand' topics, and I believe it might be really helpful to you: https://thelastpickle.com/blog/2018/04/11/gc-tuning.html
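Whatever you end up doing, a prerequisite is having GC logs to look at. On Cassandra 3.x with Java 8 the logging flags live in conf/jvm.options and look roughly like this (a sketch only; depending on your version some or all of it may already be enabled by default, and the log path and sizes are of course up to you):

    # Where to write the GC log (rotated so it does not grow forever)
    -Xloggc:/var/log/cassandra/gc.log
    -XX:+UseGCLogFileRotation
    -XX:NumberOfGCLogFiles=10
    -XX:GCLogFileSize=10M
    # Detail needed to compute pause times and GC throughput
    -XX:+PrintGCDetails
    -XX:+PrintGCDateStamps
    -XX:+PrintGCApplicationStoppedTime
    -XX:+PrintTenuringDistribution

With rotation enabled the logs stay small, and these are the files you can later upload to the analysis tool I mention below.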
No one starts working on garbage collection unless they have to; for many of us it is a different world we know nothing about. That was my case: I did not touch GC at all for my first two years. But really, you will see that GC can be reasoned about and that things can be made much better in some cases. The first time I changed the GC settings, I halved the cluster size and still halved the latency, so the improvements were substantial and well worth the effort we put into it.

To break the ice with GC, I found http://gceasy.io to be an excellent way to monitor and troubleshoot it. Feed it some GC logs and it will give you the GC throughput (the % of time the JVM is available, i.e. not in a 'stop the world' pause). To give you some numbers, this should be above 95-98% as a minimum. If your throughput is lower, chances are high that you can 'easily' improve performance there. The analysis contains a lot more detail that might help you wrap your head around GC and tune it properly.

I generally prefer using CMS, but I have seen some very successful clusters using G1GC. G1GC is known to work better with bigger heaps. If you are going to use 8 GB (or even 16 GB) for the heap, I would most probably stick to CMS and tune it properly; but again, G1GC might work quite well, with far less effort, if you can give the heap 16+ GB, say.

Work on a canary node (a single, randomly picked node) while changing this, then observe the logs with GCeasy. Repeat until you are happy with the result (I would be happy with roughly 95% to 98% GC throughput, i.e. 2 to 5% of the time spent in pauses). But what really matters is that after the changes you see better latency, fewer dropped messages, etc.; the GC throughput is just a way to measure the impact. When the workload seems optimised enough, or you are tired of playing with GC, apply the changes everywhere and observe the impact on the cluster (latency, dropped messages, CPU load...). To make this a bit more concrete, I pasted a sketch of the kind of settings involved below the quoted mail.

Hope that helps and completes Elliott's excellent answer somewhat.

-----------------------
Alain Rodriguez - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com


On Mon, 20 May 2019 at 23:31, Elliott Sims <elli...@backblaze.com> wrote:

> It's not really something that can be easily calculated based on write
> rate, but more something you have to find empirically and adjust
> periodically.
> Generally speaking, I'd start by running "nodetool gcstats" or similar and
> just see what the GC pause stats look like. If it's not pausing much or
> for long, you're good. If it is, you'll likely need to do some tuning
> based on GC logging which may involve increasing the heap but could also
> mean decreasing it or changing the collection strategy.
>
> Generally speaking, with G1GC you can get away with just setting a larger
> heap than you really need and it's close enough to optimal. CMS is
> theoretically more efficient, but far more complex to get tuned properly
> and tends to fail more dramatically.
>
> On Mon, May 20, 2019 at 7:38 AM Akshay Bhardwaj <
> akshay.bhardwaj1...@gmail.com> wrote:
>
>> Hi Experts,
>>
>> I have a 5 node cluster with 8 core CPU and 32 GiB RAM.
>>
>> If I have a write TPS of 5K/s and read TPS of 8K/s, I want to know what
>> is the optimal heap size configuration for each cassandra node.
>>
>> Currently, the heap size is set at 8GB. How can I know if cassandra
>> requires more or less heap memory?
>>
>> Akshay Bhardwaj
>> +91-97111-33849
>>
>
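PS: to make 'tune it properly' a bit more concrete, these are the kind of knobs I am referring to, as they appear in conf/jvm.options on Cassandra 3.x (cassandra-env.sh on older versions). This is only a sketch built around common defaults, not a recommendation; the right values depend entirely on your hardware and workload:

    # Fixed heap size (set -Xms equal to -Xmx so the heap never resizes)
    -Xms8G
    -Xmx8G

    # CMS (the historical Cassandra default); the young generation size (-Xmn)
    # is usually the first knob to play with on a canary node
    -XX:+UseParNewGC
    -XX:+UseConcMarkSweepGC
    -XX:CMSInitiatingOccupancyFraction=75
    -XX:+UseCMSInitiatingOccupancyOnly
    -Xmn2G
    -XX:SurvivorRatio=8
    -XX:MaxTenuringThreshold=1

    # Or G1, typically with a bigger heap (16 GB+) - use one collector, not both
    # -XX:+UseG1GC
    # -XX:MaxGCPauseMillis=500

And before changing anything, running 'nodetool gcstats' on a node, as Elliott suggested, gives a quick first view of recent GC pause counts and durations without touching any configuration.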