I agree with Erick and believe it's most likely a hot-partition issue. I'd check "Compacted partition maximum bytes" in nodetool tablestats on those "affected" nodes and compare the result with the other nodes.

I'd also check how the CPU load is affected: in my experience, during excessive GC the CPU load is the first to show stress, which correlates with the dropped mutations/reads that may occur (you can search the logs for such drops). If you run repairs and use LOCAL_QUORUM you might not notice it, but it's still worth avoiding.

Another thing that comes to mind is a considerable number of tombstones or frequently updated rows in some partitions, which can create unnecessary overhead and increased GC pressure.
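As a rough sketch of that check (keyspace/table names and the sample numbers below are placeholders, not from your cluster), you can filter the tablestats output per node and compare the values; the actual nodetool invocations are shown in the comments:

```shell
# On each node (or remotely with -h <host>) you would run something like:
#   nodetool tablestats my_keyspace.my_table | grep "Compacted partition maximum bytes"
# and, if you know a suspect partition key, find which replicas own it with:
#   nodetool getendpoints my_keyspace my_table 'suspect_key'

# Illustrative tablestats excerpt (fabricated numbers) to show the filtering:
sample="	Compacted partition minimum bytes: 61
	Compacted partition maximum bytes: 4139110981
	Compacted partition mean bytes: 1721"
printf '%s\n' "$sample" | grep "Compacted partition maximum bytes"
```

If one node reports a maximum far above the others (gigabytes vs. kilobytes, say), that's a strong hint a hot/oversized partition lives on its replicas.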
Regards,

On Thu, Sep 3, 2020 at 1:54 AM Erick Ramirez <erick.rami...@datastax.com> wrote:

> That would have been my first response too -- hot partitions. If you know
> the partition keys, you can quickly confirm it with nodetool getendpoints.
> Cheers!