Thanks Bowen and Jon for the clarification and suggestions! I will go
through them and dig deeper.

Yes, the JVM heap size is fixed and I can see it is allocated at all times.
The spikes I am referring to come on top of the heap allocation.
I tuned the heap settings in the past to resolve GC pause issues (thanks to
Jon for the blog on GC tuning!), but the spikes existed even before that.
So the page cache might be the cause here.
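
One way to check this is to split the node's memory usage into page cache
and everything else. Below is a minimal Python sketch, assuming a Linux node
that exposes /proc/meminfo. The function names are hypothetical, and treating
Buffers + Cached as "page cache" is a simplification (Cached also includes
shared memory), so the numbers are only a rough guide.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def memory_breakdown(fields):
    """Return memory usage percentages with and without the page cache.

    Assumption: page cache is approximated as Buffers + Cached.
    """
    total = fields["MemTotal"]
    free = fields["MemFree"]
    cache = fields.get("Buffers", 0) + fields.get("Cached", 0)
    return {
        "used_incl_cache_pct": 100.0 * (total - free) / total,
        "used_excl_cache_pct": 100.0 * (total - free - cache) / total,
        "page_cache_pct": 100.0 * cache / total,
    }
```

Feeding it the contents of /proc/meminfo during a spike and again afterwards
should show whether only page_cache_pct moves. If used_excl_cache_pct stays
flat while used_incl_cache_pct jumps, the spike is most likely page cache
filled by compaction rather than process memory.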


-vignesh

On Wed, Feb 26, 2025 at 9:01 PM Jon Haddad <j...@rustyrazorblade.com> wrote:

> Can you explain a bit more what you mean by memory spikes?
>
> The defaults we ship use the same settings for min and max JVM heap size,
> so you should see all the memory allocated to the JVM at startup.  Did you
> change anything here?  I don't recommend doing so.
>
> If you're referring to files in the page cache, that's unavoidable
> (today).  Compaction reads in data and creates new files, and we do that
> through the pread system call, which goes through the page cache.  This
> could be eliminated if we move compaction to direct io [1], but that's not
> something that can be done through config.  There's no way to eliminate
> files being loaded into the page cache right now, and trying to do so means
> fighting your OS and the optimizations it provides.  vmstat [2] can be
> helpful to understand what's going on here.  There's a good
> reference, Linux Performance Analysis in 60,000 Milliseconds [3], which
> shows how to use it as well as other tools.
>
> Jon
>
> [1] https://issues.apache.org/jira/browse/CASSANDRA-19987
> [2] https://www.man7.org/linux/man-pages/man8/vmstat.8.html
> [3]
> https://netflixtechblog.com/linux-performance-analysis-in-60-000-milliseconds-accc10403c55
>
>
> On Wed, Feb 26, 2025 at 4:30 AM vignesh s <vigneshclou...@gmail.com>
> wrote:
>
>> *Setup:*
>> I have a Cassandra cluster running in 3 datacenters with 3 nodes each
>> (total 9 nodes), hosted on GCP.
>> • *Replication Factor:* 3-3-3
>> • *Compaction Strategy:* LeveledCompactionStrategy
>> • *Heap Memory:* 10 GB (Total allocated memory: 32 GB)
>> • *Off-heap Memory:* around 4 GB
>> • *Workload:* ~1.5K writes/s per node, ~100 reads/s per node (both using
>> LOCAL_QUORUM)
>>
>> *Issue:*
>> I am observing short-lived memory spikes where total memory usage jumps
>> from 44% to 85%. These occur periodically and last for a short period.
>> After monitoring tpstats, I noticed that compaction threads are running
>> during these spikes.
>> While I understand that compaction is a fundamental process, these memory
>> spikes make capacity planning difficult.
>> I tried adjusting the following settings, but they did not have any
>> effect on the spikes:
>> • compaction_throughput_mb_per_sec
>> • concurrent_compactors
>>
>> *Questions:*
>> 1. Are there other settings I can tune to reduce memory spikes?
>> 2. Could something else be causing these spikes apart from compaction?
>>
>> Would appreciate any insights on how to smooth out memory usage.
>>
>> - vignesh
>>
>
