> We saw in the first log line of gc.log that the region size is set to 2M and after checking a heap dump of the Cassandra process we observed a high number of objects with 1M size. We concluded that using 4M is a safe option to start testing.
Isn't the default 16m? I'd be curious to know how it went from 16m to 2m to begin with. I see it's commented out in what you've pasted above. With it commented out, perhaps it was being set to 2m implicitly? If that was set or commented out in error, I'd try setting it back to 16m and see if things improve.

On Wed, Sep 10, 2025 at 7:04 AM Julien Laurenceau <julien.laurenc...@pepitedata.com> wrote:

> Hi
>
> I think you may have better luck using jdk17 and ZGC or shenandoah.
> As shown by datastax here:
> https://www.datastax.com/blog/apache-cassandra-benchmarking-40-brings-heat-new-garbage-collectors-zgc-and-shenandoah
>
> Regards
>
> On Wednesday, September 10, 2025 13:50 CEST, "Michalis Kotsiouros (EXT) via user" <user@cassandra.apache.org> wrote:
>
> Hello Cassandra community,
>
> We are using Cassandra 4.1.5 with the G1 garbage collector on Java 11.
>
> We are using the default G1 settings as found in the jvm11-server.options delivered by the Cassandra installation. Those are:
>
> ## G1 Settings
> ## Use the Hotspot garbage-first collector.
> -XX:+UseG1GC
> -XX:InitialRAMPercentage=50.0
> -Xlog:gc=info,heap*=info,age*=info,safepoint=info,promotion*=info:file=/var/log/cassandra/gc.log:time,uptime,pid,tid,level:filecount=10,filesize=10485760
> -XX:MaxRAMPercentage=50.0
> #-XX:+ParallelRefProcEnabled
> ##-XX:MaxTenuringThreshold=1
> #-XX:G1HeapRegionSize=16m
> #
> ## Have the JVM do less remembered set work during STW, instead
> ## preferring concurrent GC. Reduces p99.9 latency.
> #-XX:G1RSetUpdatingPauseTimePercent=5
> #
> ## Main G1GC tunable: lowering the pause target will lower throughput and vice versa.
> ## 200ms is the JVM default and lowest viable setting
> ## 1000ms increases throughput. Keep it smaller than the timeouts in cassandra.yaml.
> -XX:MaxGCPauseMillis=200
>
> ## Optional G1 Settings
> # Save CPU time on large (>= 16GB) heaps by delaying region scanning
> # until the heap is 70% full. The default in Hotspot 8u40 is 40%.
> -XX:InitiatingHeapOccupancyPercent=70
>
> # For systems with > 8 cores, the default ParallelGCThreads is 5/8 the number of logical cores.
> # Otherwise equal to the number of cores when 8 or less.
> # Machines with > 10 cores should try setting these to <= full cores.
> #-XX:ParallelGCThreads=16
> # By default, ConcGCThreads is 1/4 of ParallelGCThreads.
> # Setting both to the same value can reduce STW durations.
> #-XX:ConcGCThreads=16
>
> We have observed that Cassandra occasionally underperforms in the production deployment. Based on log analysis, we could correlate the underperformance (slow operations and dropped internal messages) with frequent long GC pauses. The Cassandra node would then stop due to OOM, and after the restart the performance would be OK for some days. After 3-4 days, we would see the same behavior again, with the system recovering after the restart due to OOM.
>
> We used to run our application on Cassandra 3.11 with the CMS GC, and we did not see such behavior.
>
> We have been checking gc.log and observed that the number of humongous regions was reaching around 3K, which is definitely not normal for G1.
>
> After some research into G1 garbage collection, we tried to increase the region size using the -XX:G1HeapRegionSize JVM option.
>
> We saw in the first log line of gc.log that the region size is set to 2M and after checking a heap dump of the Cassandra process we observed a high number of objects with 1M size. We concluded that using 4M is a safe option to start testing. Our first test in the test deployment shows a tremendous reduction in the number of humongous regions, from ~1.5K to ~20.
>
> Has anyone else in the community observed similar issues when using the G1 GC?
>
> Do you consider the heap region size setting to be application dependent, or does it depend on Cassandra's internal design?
>
> If the region size setting mostly depends on Cassandra's internal design, is there any general recommendation that would cover the majority of applications?
>
> BR
> MK
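
On where an implicit 2M could come from: as far as I understand it, when -XX:G1HeapRegionSize is not set, G1 on JDK 11 derives the region size from the heap size, aiming for roughly 2048 regions, rounding down to a power of two and clamping to the 1M-32M range. G1 then treats an object as humongous once it is larger than half a region. Below is a minimal sketch of that arithmetic, not HotSpot code. The function names, the 4 GiB heap, and the 16-byte array header are assumptions for illustration (the original post does not state the heap size), and other JDK versions may differ in the details.

```python
# Sketch of G1's ergonomic region sizing as I understand the JDK 11
# heuristic. Hypothetical helper names; not an official API.
MB = 1024 * 1024
GB = 1024 * MB
TARGET_REGIONS = 2048              # G1 aims for roughly this many regions
MIN_REGION, MAX_REGION = 1 * MB, 32 * MB


def ergonomic_region_size(initial_heap: int, max_heap: int) -> int:
    """Region size picked when -XX:G1HeapRegionSize is left unset."""
    average = (initial_heap + max_heap) // 2
    size = max(average // TARGET_REGIONS, MIN_REGION)
    size = 1 << (size.bit_length() - 1)   # round down to a power of two
    return min(max(size, MIN_REGION), MAX_REGION)


def is_humongous(object_size: int, region_size: int) -> bool:
    """An object larger than half a region is allocated as humongous."""
    return object_size > region_size // 2


# Assumed example: a 4 GiB heap and a 1 MiB byte array with a 16-byte header.
region = ergonomic_region_size(4 * GB, 4 * GB)
array_size = 1 * MB + 16
print(region // MB, "MiB regions")
print("humongous with ergonomic size:", is_humongous(array_size, region))
print("humongous with 4 MiB regions:", is_humongous(array_size, 4 * MB))
```

Under these assumptions, a heap between 4 GiB and 8 GiB would end up with 2M regions, which would make every ~1M buffer humongous. Going to 4M (or 16M) moves those buffers back below the threshold, which would be consistent with the drop in humongous regions reported above.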