You're welcome, Michalis! Let us know if you need anything else.

On Wed, Sep 10, 2025 at 10:56 AM Michalis Kotsiouros (EXT) via user <user@cassandra.apache.org> wrote:
> Hello Aaron,
>
> Yes indeed, you are right. I will check by using the values as found in the jvm11-server.options.
>
> From my lab node, I would say that increasing the region size to 4m reduced the number of humongous regions.
>
> Maybe it is safer to go for 16m in production, since this will allow even larger objects to be handled better by the GC.
>
> I will update you with the outcome from the production node.
>
> Thanks a lot for your valuable replies!
>
> BR
> MK
>
> *From:* Aaron <aaronplo...@gmail.com>
> *Sent:* September 10, 2025 18:13
> *To:* user@cassandra.apache.org
> *Subject:* Re: Cassandra performance issues after introducing G1 GC
>
> When enabling G1GC in the JVM(11) options file, the expectation is to comment out the CMS-related lines and un-comment the G1GC-related lines (under the "###G1 Settings" header). This would enable the following settings for G1GC:
>
> -XX:+UseG1GC
> -XX:+ParallelRefProcEnabled
> -XX:MaxTenuringThreshold=1
> -XX:G1HeapRegionSize=16m
> -XX:G1RSetUpdatingPauseTimePercent=5
> -XX:MaxGCPauseMillis=300
>
> This is a solid, recommended starting point for most workloads.
>
> Truth be told, I've configured dozens of Cassandra clusters running with G1GC, and most did not require these values to be significantly altered...maybe MaxGCPauseMillis, depending on the workload.
>
> In any case, if G1HeapRegionSize is a *fraction* of what we recommend, setting it to 16m is where I would start. It also couldn't hurt to check the remaining settings for any significant deviations from the above.
>
> On Wed, Sep 10, 2025 at 9:40 AM Michalis Kotsiouros (EXT) via user <user@cassandra.apache.org> wrote:
>
> Hello Aaron,
>
> No, the -XX:G1HeapRegionSize=16m that is present in the default jvm11-server.options file delivered by the Cassandra installation is already commented out.
>
> That is why I am asking if there is any previous experience by the user community.
>
> Based on the G1 GC documentation, if this parameter is not set, it is determined automatically during JVM startup. I do not know how, though. Based on the gc.log, the current value used is 2m.
>
> BR
> MK
>
> *From:* Aaron <aaronplo...@gmail.com>
> *Sent:* September 10, 2025 16:36
> *To:* user@cassandra.apache.org
> *Subject:* Re: Cassandra performance issues after introducing G1 GC
>
> We saw in the first log line of gc.log that the region size is set to 2M and after checking a heap dump of the Cassandra process we observed a high number of objects with 1M size. We concluded that using 4M is a safe option to start testing.
>
> Isn't the default 16m? I'd be curious to know how that went from 16m to 2m, to begin with. I see it's commented out in what you've pasted above. With it commented out, perhaps it was being set to 2m implicitly? If it was set or commented out in error, I'd try setting it back to 16m and see if things improve.
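On the "determined automatically" behavior discussed above: when -XX:G1HeapRegionSize is unset, G1's ergonomics pick a power-of-two region size between 1m and 32m that yields roughly 2048 regions for the configured heap, which is how a mid-sized heap ends up at 2m. Below is a minimal sketch for confirming what the JVM actually chooses; the 8 GB heap and the log path are assumptions, so substitute your node's values:

    # Ask the JVM which region size its ergonomics select for a given heap
    # (hypothetical 8 GB heap; match your node's -Xms/-Xmx):
    java -Xms8g -Xmx8g -XX:+UseG1GC -XX:+PrintFlagsFinal -version | grep G1HeapRegionSize

    # Or read back what the running node logged at startup (the exact
    # wording of this line varies between JDK builds):
    grep -i -m 1 "region size" /var/log/cassandra/gc.log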
> On Wed, Sep 10, 2025 at 7:04 AM Julien Laurenceau <julien.laurenc...@pepitedata.com> wrote:
>
> Hi,
>
> I think you may have better luck using JDK 17 and ZGC or Shenandoah, as shown by DataStax here:
> https://www.datastax.com/blog/apache-cassandra-benchmarking-40-brings-heat-new-garbage-collectors-zgc-and-shenandoah
>
> Regards
>
> On Wednesday, September 10, 2025 13:50 CEST, "Michalis Kotsiouros (EXT) via user" <user@cassandra.apache.org> wrote:
>
> Hello Cassandra community,
>
> We are using Cassandra 4.1.5 with the G1 garbage collector on Java 11. We are using the default G1 settings as found in the jvm11-server.options delivered by the Cassandra installation. Those are:
>
> ## G1 Settings
> ## Use the Hotspot garbage-first collector.
> -XX:+UseG1GC
> -XX:InitialRAMPercentage=50.0
> -Xlog:gc=info,heap*=info,age*=info,safepoint=info,promotion*=info:file=/var/log/cassandra/gc.log:time,uptime,pid,tid,level:filecount=10,filesize=10485760
> -XX:MaxRAMPercentage=50.0
> #-XX:+ParallelRefProcEnabled
> ##-XX:MaxTenuringThreshold=1
> #-XX:G1HeapRegionSize=16m
> #
> ## Have the JVM do less remembered set work during STW, instead
> ## preferring concurrent GC. Reduces p99.9 latency.
> #-XX:G1RSetUpdatingPauseTimePercent=5
> #
> ## Main G1GC tunable: lowering the pause target will lower throughput and vice versa.
> ## 200ms is the JVM default and lowest viable setting
> ## 1000ms increases throughput. Keep it smaller than the timeouts in cassandra.yaml.
> -XX:MaxGCPauseMillis=200
>
> ## Optional G1 Settings
> # Save CPU time on large (>= 16GB) heaps by delaying region scanning
> # until the heap is 70% full. The default in Hotspot 8u40 is 40%.
> -XX:InitiatingHeapOccupancyPercent=70
>
> # For systems with > 8 cores, the default ParallelGCThreads is 5/8 the number of logical cores.
> # Otherwise equal to the number of cores when 8 or less.
> # Machines with > 10 cores should try setting these to <= full cores.
> #-XX:ParallelGCThreads=16
> # By default, ConcGCThreads is 1/4 of ParallelGCThreads.
> # Setting both to the same value can reduce STW durations.
> #-XX:ConcGCThreads=16
>
> We have observed in the production deployment that Cassandra occasionally underperforms. Based on log analysis, we could correlate the underperformance (slow operations and dropped internal messages) with frequent long GC pauses. The Cassandra node would then stop due to OOM, and after the restart the performance would be OK for some days. After 3-4 days, we would see the same behavior again, with the system recovering after the OOM-triggered restart.
>
> Our application previously ran on Cassandra 3.11 with the CMS GC, and we did not see such behavior.
>
> We have been checking the gc.log and observed that the number of humongous regions was reaching around 3K, which is definitely not normal for G1. After some research on G1 garbage collection, we tried to increase the region size using the -XX:G1HeapRegionSize JVM option.
>
> We saw in the first log line of gc.log that the region size is set to 2M, and after checking a heap dump of the Cassandra process we observed a high number of objects of 1M size. We concluded that using 4M is a safe option to start testing. Our first test from the test deployment shows a tremendous reduction in the number of humongous regions: from ~1.5K to ~20.
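To put numbers on the observation above: G1 classifies any object larger than half a region as humongous, so with 2m regions the 1m objects seen in the heap dump sit right at the cutoff, while 4m regions raise the cutoff to 2m and 16m regions raise it to 8m. A small sketch for watching the humongous-region count over time, assuming the heap*=info logging configured above (the log wording may differ slightly across JDK builds):

    # With gc+heap logging enabled, each collection logs a transition such
    # as "Humongous regions: 1532->1530"; extract the most recent samples:
    grep -oE "Humongous regions: [0-9]+->[0-9]+" /var/log/cassandra/gc.log | tail -n 20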
> Has anyone else in the community observed similar issues when using the G1 GC?
>
> Do you consider the setting of the heap region size to be application dependent, or does it depend on Cassandra's internal design?
>
> If the region size setting mostly depends on Cassandra's internal design, is there any general recommendation that would cover the majority of applications?
>
> BR
> MK
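Taken together, acting on Aaron's suggestion amounts to un-commenting the G1 section of jvm11-server.options so that it reads roughly as below. This is a sketch of a starting point taken from his message earlier in the thread, not a guaranteed fix; the region size in particular may still need workload-specific testing, as the 4m experiment above shows:

    ### G1 Settings
    -XX:+UseG1GC
    -XX:+ParallelRefProcEnabled
    -XX:MaxTenuringThreshold=1
    -XX:G1HeapRegionSize=16m
    -XX:G1RSetUpdatingPauseTimePercent=5
    -XX:MaxGCPauseMillis=300

After changing the file and restarting the node, the humongous-region counts in gc.log (as sketched above) show whether the new threshold actually absorbs the 1m allocations.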