The section *"Why does top report that Cassandra is using a lot more memory than the Java heap max?" *on the page https://cassandra.apache.org/doc/latest/cassandra/overview/faq/index.html can provide some useful information. I have seen Cassandra tries to take all the available free RAM using mapped files. For example if I have a 16G heap and 32G full RAM, memory usage will go till 32GB. And if I have 16G headp and 64G full RAM, memory usage spikes to 50-56G. This happens mostly for reads in our case when we are generating reports via spark.
On Thu, Feb 27, 2025 at 2:07 PM Dmitry Konstantinov <netud...@gmail.com> wrote:

> I would recommend first checking which exact metric is showing the drops: free memory or available memory. There is a common misconception about free vs. available memory in Linux: https://www.linuxatemyram.com/
> Otherwise, if you really do have spikes in *used* memory, and these are spikes in memory used by the Cassandra process itself, then I would try to decompose the memory usage. As Jon mentioned, the default and expected configuration for the Java heap is Xmx=Xms plus -XX:+AlwaysPreTouch, so the memory for the Java heap is allocated in advance and does not change dynamically from the OS point of view as long as those settings are in place. In that case some other memory region must be growing; here are some resources that can help you figure out which one:
> https://www.youtube.com/watch?v=c755fFv1Rnk
> https://shipilev.net/jvm/anatomy-quarks/12-native-memory-tracking/
> https://blog.arkey.fr/2020/11/30/off-heap-reconnaissance/
>
> Regards,
> Dmitry
>
> On Thu, 27 Feb 2025 at 09:31, vignesh s <vigneshclou...@gmail.com> wrote:
>
>> Thanks Bowen and Jon for the clarification and suggestions! I will go through them and dig more.
>>
>> Yes, the JVM heap size is fixed and I can see it is allocated at all times. The spikes I am referring to happen in addition to the heap-allocated memory.
>> I had tuned the heap settings to resolve GC pause issues in the past (thanks to Jon for the blog on GC tuning!), but the spikes existed even before that. So the page cache might be the cause here.
>>
>> -vignesh
>>
>> On Wed, Feb 26, 2025 at 9:01 PM Jon Haddad <j...@rustyrazorblade.com> wrote:
>>
>>> Can you explain a bit more what you mean by memory spikes?
>>>
>>> The defaults we ship use the same settings for min and max JVM heap size, so you should see all the memory allocated to the JVM at startup. Did you change anything here? I don't recommend doing so.
>>>
>>> If you're referring to files in the page cache, that's unavoidable (today). Compaction reads in data and creates new files, and we do that through the pread system call, which goes through the page cache. This could be eliminated if we move compaction to direct I/O [1], but that's not something that can be done through config. There's no way to prevent files from being loaded into the page cache right now, and trying to do so means fighting your OS and the optimizations it provides. vmstat [2] can be helpful for understanding what's going on here. There's a good reference, Linux Performance Analysis in 60,000 Milliseconds [3], which shows how to use it as well as other tools.
>>>
>>> Jon
>>>
>>> [1] https://issues.apache.org/jira/browse/CASSANDRA-19987
>>> [2] https://www.man7.org/linux/man-pages/man8/vmstat.8.html
>>> [3] https://netflixtechblog.com/linux-performance-analysis-in-60-000-milliseconds-accc10403c55
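For reference, the checks suggested in the replies above roughly boil down to commands like the following (the PID is a placeholder, and native memory tracking only reports data if it was enabled when the JVM was started):

    # JVM-side breakdown of non-heap memory; requires the JVM to be started
    # with -XX:NativeMemoryTracking=summary
    jcmd <cassandra-pid> VM.native_memory summary

    # OS-side view: free vs available memory and page cache size
    free -m

    # memory, swap and paging activity, sampled every second
    vmstat 1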
>>> On Wed, Feb 26, 2025 at 4:30 AM vignesh s <vigneshclou...@gmail.com> wrote:
>>>
>>>> *Setup:*
>>>> I have a Cassandra cluster running in 3 datacenters with 3 nodes each (9 nodes total), hosted on GCP.
>>>> • *Replication Factor:* 3-3-3
>>>> • *Compaction Strategy:* LeveledCompactionStrategy
>>>> • *Heap Memory:* 10 GB (Total allocated memory: 32 GB)
>>>> • *Off-heap Memory:* around 4 GB
>>>> • *Workload:* ~1.5K writes/s per node, ~100 reads/s per node (both using LOCAL_QUORUM)
>>>>
>>>> *Issue:*
>>>> I am observing short-lived memory spikes where total memory usage jumps from 44% to 85%. These occur periodically and last for a short period. After monitoring tpstats, I noticed that compaction threads are running during these spikes.
>>>> While I understand that compaction is a fundamental process, these memory spikes make capacity planning difficult.
>>>> I tried adjusting the following settings, but they did not have any effect on the spikes:
>>>> • compaction_throughput_mb_per_sec
>>>> • concurrent_compactors
>>>>
>>>> *Questions:*
>>>> 1. Are there other settings I can tune to reduce memory spikes?
>>>> 2. Could something else be causing these spikes apart from compaction?
>>>>
>>>> Would appreciate any insights on how to smooth out memory usage.
>>>>
>>>> - vignesh
>
> --
> Dmitry Konstantinov
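One way to narrow down whether such spikes are process memory or page cache, in the spirit of the free-vs-available distinction discussed above (the PID is a placeholder), is to capture both views during a spike:

    # resident set size of the Cassandra process, in kB
    ps -o rss= -p <cassandra-pid>

    # system-wide counters: if MemAvailable stays high and Cached grows,
    # the spike is mostly page cache rather than process memory
    grep -E '^(MemFree|MemAvailable|Cached)' /proc/meminfo

If the process RSS stays flat while Cached grows during compaction, the spike is reclaimable page cache and generally not a capacity-planning problem.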