> we are currently using 3.0.9. should we use 3.8 or 3.10
No, don't use 3.X in production unless you really need a major feature. I would
advise sticking to 3.0.X (i.e. 3.0.11 now). You can backport CASSANDRA-11966
easily, but of course you have to deploy from source as a prerequisite.
> I haven't done any tuning yet.
That's good news, because it means there may be room for improvement.
> Can I change this on a running instance? If so, how? or does it require a
> downtime?
You can throttle compaction at runtime with "nodetool setcompactionthroughput".
Be sure to read through all the nodetool commands; some of them are really
useful for day-to-day tuning and management.
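As a rough sketch of the runtime throttling described above: the small wrapper below prints the current limit, applies a new one, and prints it again. The function name throttle_compaction and the value 16 are only illustrative assumptions; getcompactionthroughput and setcompactionthroughput are the regular nodetool subcommands. The value is in MB/s and 0 disables throttling. A value set this way does not survive a restart, so compaction_throughput_mb_per_sec in cassandra.yaml should be updated as well if you want to keep it.

```shell
# Hypothetical helper: show the current compaction throughput limit,
# set a new one (in MB/s, 0 = unthrottled), then show it again.
throttle_compaction() {
  nodetool getcompactionthroughput
  nodetool setcompactionthroughput "$1"
  nodetool getcompactionthroughput
}

# Example call on a node (16 is an arbitrary example value):
# throttle_compaction 16
```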
If GC is fine, then check other things -> "[...] different pool sizes for NTR,
concurrent reads and writes, compaction executors, etc. Also check if you can
improve network latency (e.g. VF or ENA on AWS)."
Regarding thread pools, some of them can be resized at runtime via JMX.
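Before resizing anything, it may help to see which pools are actually backed up. The sketch below filters the output of "nodetool tpstats" for pools with pending or blocked tasks. The function name backed_up_pools is hypothetical, and the filter assumes the six-column pool layout (name, Active, Pending, Completed, Blocked, All time blocked) that I would expect on 3.0.X; check the output on your own nodes before relying on it.

```shell
# Hypothetical helper: list thread pools that currently have pending tasks
# or that have ever blocked. Assumes pool rows have exactly six
# whitespace-separated columns, so the header line and the
# "Message type / Dropped" section are skipped.
backed_up_pools() {
  nodetool tpstats | awk 'NF == 6 && ($3 + 0 > 0 || $6 + 0 > 0) {
    print $1, "pending=" $3, "all_time_blocked=" $6
  }'
}

# Example call on a node:
# backed_up_pools
```

Pools that show up here repeatedly (for example the native transport requests pool or the read/write stages) are the first candidates for the resizing mentioned above.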
> 5000 is the target.
Right now you have reached 1500. Is that per node or for the whole cluster? We
don't know your setup, so it's hard to say whether it's doable. Can you provide
more details? VM or physical nodes, number of nodes, etc.
Generally speaking, LWT should be seldom used. AFAIK you won't achieve 10,000
writes/s per node.
Maybe someone on the list has already tuned a cluster for a heavy LWT workload?
Best,
Romain