> Our JVM options are unchanged between 2.2 and 3.11
>>
>
> For the sake of clarity, do you mean:
> (a) you're using the default JVM options in 3.11 and it's different to the
> options you had in 2.2?
> (b) you've copied the same JVM options you had in 2.2 to 3.11?
>

(b), which are the default options from 2.2 (and I believe the default
options in 3.11 from a brief glance).

Copied here for clarity, though I'm skeptical that GC settings are actually
a cause here because I would expect them to only impact the upgraded node
and not the cluster overall.

### CMS Settings
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSWaitDuration=10000
-XX:+CMSParallelInitialMarkEnabled
-XX:+CMSEdenChunksRecordAlways
-XX:+CMSClassUnloadingEnabled
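
To confirm case (b) mechanically rather than by a brief glance, the two flag sets can be diffed. The following is only a minimal sketch: the function names are hypothetical, and it assumes each node's flags have already been exported as plain text with one flag per line and `#` for comments.

```python
def parse_jvm_flags(text):
    """Collect the JVM flags from options text, skipping blank lines and # comments."""
    flags = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        flags.add(line)
    return flags


def diff_jvm_flags(old_text, new_text):
    """Return (flags only in old, flags only in new), each as a sorted list."""
    old = parse_jvm_flags(old_text)
    new = parse_jvm_flags(new_text)
    return sorted(old - new), sorted(new - old)


if __name__ == "__main__":
    # Illustrative inputs only; substitute the real flag lists from each node.
    old_opts = "### CMS Settings\n-XX:+UseParNewGC\n-XX:SurvivorRatio=8\n"
    new_opts = "### CMS Settings\n-XX:+UseParNewGC\n-XX:SurvivorRatio=4\n"
    removed, added = diff_jvm_flags(old_opts, new_opts)
    print("only in old:", removed)
    print("only in new:", added)
```

Two empty lists mean the flag sets match. The comparison is textual, so the same flag with a different value shows up once on each side.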


> The distinction is important because at the moment, you need to go through
> a process of elimination to identify the cause.
>
>
>> Read throughput (rate, bytes read/range scanned, etc.) seems fairly
>> consistent before and after the upgrade across all nodes.
>>
>
> What I was trying to get at is whether the upgraded node was getting hit
> with more traffic compared to the other nodes since it will indicate that
> the longer GCs are just the symptom, not the cause.
>
>
I don't see any distinct change, nor do I see an increase in traffic to the
upgraded node that would result in longer GC pauses.  Frankly I don't see
any changes or aberrations in client-related metrics at all that correlate
to the GC pauses, except for the corresponding timeouts.
