Re: Cassandra performance issues after introducing G1 GC

Aaron Sat, 13 Sep 2025 10:53:04 -0700

When enabling G1GC in the JVM (11) options file, the expectation is to
comment out the CMS-related lines and uncomment the G1GC-related lines
(under the "###G1 Settings" header). This would enable the following
settings for G1GC:


-XX:+UseG1GC
-XX:+ParallelRefProcEnabled
-XX:MaxTenuringThreshold=1
-XX:G1HeapRegionSize=16m
-XX:G1RSetUpdatingPauseTimePercent=5
-XX:MaxGCPauseMillis=300

This is a solid, recommended starting point for most workloads.
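
To see which of the settings listed above are actually active in a given
options file, something like the following could be used. It is only a
minimal sketch: the function names are made up for illustration, the
recommended values are simply the six lines listed above, and values are
compared as plain strings (so "16M" would be reported as differing from
"16m").

```python
import re

# The six G1 settings listed above; "+" means the boolean flag is enabled.
RECOMMENDED = {
    "UseG1GC": "+",
    "ParallelRefProcEnabled": "+",
    "MaxTenuringThreshold": "1",
    "G1HeapRegionSize": "16m",
    "G1RSetUpdatingPauseTimePercent": "5",
    "MaxGCPauseMillis": "300",
}

# Matches "-XX:+Name" / "-XX:-Name" or "-XX:Name=value".
_XX_FLAG = re.compile(r"^-XX:(?:([+-])(\w+)|(\w+)=(\S+))$")


def active_xx_flags(options_text):
    """Return {flag_name: value} for every uncommented -XX line."""
    flags = {}
    for raw in options_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank or commented-out line
        match = _XX_FLAG.match(line)
        if not match:
            continue  # e.g. -Xlog:..., -Xmx..., or anything else
        if match.group(2):
            flags[match.group(2)] = match.group(1)
        else:
            flags[match.group(3)] = match.group(4)
    return flags


def deviations(options_text, recommended=RECOMMENDED):
    """Return {flag_name: (active_value_or_None, recommended_value)}."""
    active = active_xx_flags(options_text)
    return {
        name: (active.get(name), wanted)
        for name, wanted in recommended.items()
        if active.get(name) != wanted
    }
```

Calling deviations() on the contents of jvm11-server.options (its location
depends on the installation) returns an empty dict when all six settings
match; a value of None means the flag is commented out or missing.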


Truth be told, I've configured dozens of Cassandra clusters running with
G1GC, and most did not require these values to be significantly
altered...maybe MaxGCPauseMillis, depending on the workload.

In any case, if G1HeapRegionSize is a *fraction* of what we recommend,
setting that to 16m is where I would start. It also couldn't hurt to check
the remaining settings for any significant deviations from the above.
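
On how an unset G1HeapRegionSize can end up at 2m: as far as I understand
HotSpot's ergonomics in JDK 11, the JVM aims for roughly 2048 regions,
rounds the result down to a power of two, and clamps it between 1 MiB and
32 MiB. An allocation larger than about half a region is then treated as
humongous. The sketch below only mirrors that understanding; the constants
and function names are assumptions for illustration, not something taken
from this thread or from the Cassandra options file.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

# Assumed HotSpot (JDK 11) ergonomic bounds; not stated in this thread.
TARGET_REGION_COUNT = 2048
MIN_REGION_BYTES = 1 * MIB
MAX_REGION_BYTES = 32 * MIB


def ergonomic_region_size(initial_heap_bytes, max_heap_bytes):
    """Approximate the region size G1 picks when G1HeapRegionSize is unset."""
    average_heap = (initial_heap_bytes + max_heap_bytes) // 2
    size = max(average_heap // TARGET_REGION_COUNT, MIN_REGION_BYTES)
    # Round down to the largest power of two that is <= size.
    size = 1 << (size.bit_length() - 1)
    return min(max(size, MIN_REGION_BYTES), MAX_REGION_BYTES)


def humongous_threshold(region_bytes):
    """Allocations larger than this many bytes are treated as humongous."""
    return region_bytes // 2


if __name__ == "__main__":
    for heap_gib in (4, 6, 8, 16, 32):
        region = ergonomic_region_size(heap_gib * GIB, heap_gib * GIB)
        print(
            f"{heap_gib:>2} GiB heap -> {region // MIB} MiB regions, "
            f"humongous above {humongous_threshold(region) // MIB} MiB"
        )
```

Under these assumptions, any heap from 4 GiB up to just under 8 GiB gives 2
MiB regions, which would be consistent with the 2m seen in gc.log (the
actual heap size is not stated in this thread). It would also explain the
humongous counts: with 2m regions a ~1M object plus its header crosses the
1 MiB threshold, while with 4m or 16m regions it does not.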

On Wed, Sep 10, 2025 at 9:40 AM Michalis Kotsiouros (EXT) via user <
user@cassandra.apache.org> wrote:

> Hello Aaron,
>
> No, the -XX:G1HeapRegionSize=16m line in the default jvm11-server.options
> file shipped with the Cassandra installation is already commented out.
>
> That is why I am asking if there is any previous experience by the user
> community.
>
> According to the G1 GC documentation, if this parameter is not set, it is
> determined automatically during JVM startup. I do not know how, though.
> Based on the gc.log, the value currently in use is 2m.
>
>
>
> BR
>
> MK
>
>
>
> *From:* Aaron <aaronplo...@gmail.com>
> *Sent:* September 10, 2025 16:36
> *To:* user@cassandra.apache.org
> *Subject:* Re: Cassandra performance issues after introducing G1 GC
>
>
>
> We saw in the first log line of gc.log that the region size is set to 2M
> and after checking a heap dump of the Cassandra process we observed a high
> number of objects with 1M size. We concluded that using 4M is a safe option
> to start testing.
>
>
>
> Isn't the default 16m? I'd be curious to know how that went from 16m to
> 2m, to begin with. I see it's commented-out in what you've pasted above.
> With it commented-out, perhaps it was being set to 2m implicitly? If that
> was set or commented-out in error, I'd try setting it back to 16m, and see
> if things improve.
>
>
>
> On Wed, Sep 10, 2025 at 7:04 AM Julien Laurenceau <
> julien.laurenc...@pepitedata.com> wrote:
>
> Hi
>
> I think you may have better luck using JDK 17 with ZGC or Shenandoah,
> as shown by DataStax here:
> https://www.datastax.com/blog/apache-cassandra-benchmarking-40-brings-heat-new-garbage-collectors-zgc-and-shenandoah
>
>
> Regards
>
>
>
> On Wednesday, September 10, 2025 13:50 CEST, "Michalis Kotsiouros (EXT) via
> user" <user@cassandra.apache.org> wrote:
>
>
>
> Hello Cassandra community,
>
> We are using Cassandra 4.1.5 with the G1 Garbage Collection on Java 11.
>
> We are using the default G1 settings as found in the jvm11-server.options
> file shipped with the Cassandra installation. Those are:
>
> ## G1 Settings
>
> ## Use the Hotspot garbage-first collector.
>
> -XX:+UseG1GC
>
> -XX:InitialRAMPercentage=50.0
>
>
> -Xlog:gc=info,heap*=info,age*=info,safepoint=info,promotion*=info:file=/var/log/cassandra/gc.log:time,uptime,pid,tid,level:filecount=10,filesize=10485760
>
> -XX:MaxRAMPercentage=50.0
>
> #-XX:+ParallelRefProcEnabled
>
> ##-XX:MaxTenuringThreshold=1
>
> #-XX:G1HeapRegionSize=16m
>
>
>
> #
>
> ## Have the JVM do less remembered set work during STW, instead
>
> ## preferring concurrent GC. Reduces p99.9 latency.
>
> #-XX:G1RSetUpdatingPauseTimePercent=5
>
> #
>
> ## Main G1GC tunable: lowering the pause target will lower throughput and
> vice versa.
>
> ## 200ms is the JVM default and lowest viable setting
>
> ## 1000ms increases throughput. Keep it smaller than the timeouts in
> cassandra.yaml.
>
> -XX:MaxGCPauseMillis=200
>
>
>
> ## Optional G1 Settings
>
> # Save CPU time on large (>= 16GB) heaps by delaying region scanning
>
> # until the heap is 70% full. The default in Hotspot 8u40 is 40%.
>
> -XX:InitiatingHeapOccupancyPercent=70
>
>
>
> # For systems with > 8 cores, the default ParallelGCThreads is 5/8 the
> number of logical cores.
>
> # Otherwise equal to the number of cores when 8 or less.
>
> # Machines with > 10 cores should try setting these to <= full cores.
>
> #-XX:ParallelGCThreads=16
>
> # By default, ConcGCThreads is 1/4 of ParallelGCThreads.
>
> # Setting both to the same value can reduce STW durations.
>
> #-XX:ConcGCThreads=16
>
>
>
> We have observed that Cassandra occasionally underperforms in the
> production deployment. Based on log analysis, we could correlate the
> underperformance (slow operations and dropped internal messages) with
> frequent long GC pauses. The Cassandra node would then stop due to OOM, and
> after the restart performance would be fine for some days. After 3-4 days,
> we would see the same behavior again, with the system recovering after the
> OOM-triggered restart.
>
> Our application previously ran on Cassandra 3.11 with CMS GC, and we did
> not see such behavior.
>
> We have been checking the gc.log and observed that the number of
> humongous regions was reaching around 3K, which is definitely not normal
> for G1.
>
> After some research about the G1 Garbage collection, we tried to increase
> the region size using the -XX:G1HeapRegionSize JVM option.
>
> We saw in the first log line of gc.log that the region size is set to 2M
> and after checking a heap dump of the Cassandra process we observed a high
> number of objects with 1M size. We concluded that using 4M is a safe option
> to start testing. Our first test in the test deployment shows a
> tremendous reduction in the number of humongous regions, from ~1.5K to
> ~20.
>
> Has anyone else in the community observed similar issues before when using
> the G1 GC?
>
> Do you consider the Heap Region Size setting to be application dependent,
> or does it depend on Cassandra's internal design?
>
> If the region size setting mostly depends on the Cassandra internal
> design, is there any general recommendation that would cover the majority
> of applications?
>
>
>
> BR
>
> MK
>
>
>
>
>
>
>
>
>