On Thu, Jun 13, 2019 at 10:36 AM R. T. <rastr...@protonmail.com.invalid> wrote:
> Well, actually by running cfstats I can see that the totaldiskspaceused is
> about ~1.2 TB per node in DC1 and ~1 TB per node in DC2. DC2 was off
> for a while, that's why there is a difference in space.
>
> I am using Cassandra 3.0.6, and
> my stream_throughput_outbound_megabits_per_sec is the default setting, so
> according to my version that is 200 Mbps (25 MB/s).

And the other setting, compaction_throughput_mb_per_sec? It is also highly
relevant for repair performance, as streamed-in files need to be compacted
with the files already on the nodes. In our experience, a change in the
compaction throughput limit is reflected almost linearly in the repair run
time. The default of 16 MB/s is too limiting for any production-grade setup,
I believe. We go as high as 90 MB/s on AWS EBS gp2 data volumes. But don't
take that as gospel: I'd suggest you increase the setting gradually (e.g. by
doubling it) and observe how it affects repair performance (and client
latencies).

Have you tried "parallel" instead of "DC parallel" mode? The latter is really
poorly named and actually means something else, as neatly highlighted in this
SO answer: https://dba.stackexchange.com/a/175028

Last, but not least: are you using the default number of vnodes, 256? The
overhead of a large number of vnodes (times the number of nodes) can be quite
significant. We've seen major improvements in repair runtime after switching
from 256 to 16 vnodes on Cassandra 3.0.

Cheers,
--
Alex
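
P.S. In case it helps, here is a rough sketch of the knobs discussed above.
The values are only illustrative starting points for your own testing, not
recommendations for your cluster:

    # check and raise the compaction throughput cap at runtime (MB/s);
    # double it, watch repair times and client latencies, then repeat
    nodetool getcompactionthroughput
    nodetool setcompactionthroughput 32

    # once you have settled on a value, persist it in cassandra.yaml
    compaction_throughput_mb_per_sec: 32

    # fewer vnodes only take effect on newly bootstrapped nodes (e.g. a
    # new DC); num_tokens cannot be changed in place on existing nodes
    num_tokens: 16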