As to why I think it's cluster-wide, here's what the documentation says:
https://docs.datastax.com/en/cassandra/1.2/cassandra/configuration/configCassandra_yaml_r.html

compaction_throughput_mb_per_sec
<https://docs.datastax.com/en/cassandra/1.2/cassandra/configuration/configCassandra_yaml_r.html?scroll=reference_ds_qfg_n1r_1k__compaction_throughput_mb_per_sec>
(Default: 16) Throttles compaction to the specified total throughput across
the entire system. The faster you insert data, the faster you need to compact
in order to keep the SSTable count down. The recommended value is 16 to 32
times the rate of write throughput (in MBs/second). Setting the value to 0
disables compaction throttling.

Perhaps "across the entire system" means "across all keyspaces for this
Cassandra node"?
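For context, the setting in question is a single line in each node's own
cassandra.yaml (a minimal sketch; 16 is just the documented default):

    # cassandra.yaml (each node reads its own copy at startup)
    # Throttles compaction I/O; 0 disables throttling entirely.
    compaction_throughput_mb_per_sec: 16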
Compare that entry with the next one, which specifically calls out "a node":

concurrent_compactors
<https://docs.datastax.com/en/cassandra/1.2/cassandra/configuration/configCassandra_yaml_r.html?scroll=reference_ds_qfg_n1r_1k__concurrent_compactors>
(Default: 1 per CPU core) Sets the number of concurrent compaction processes
allowed to run simultaneously on a node, not including validation compactions
for anti-entropy repair. Simultaneous compactions help preserve read
performance in a mixed read-write workload by mitigating the tendency of
small SSTables to accumulate during a single long-running compaction. If
compactions run too slowly or too fast, change compaction_throughput_mb_per_sec
<https://docs.datastax.com/en/cassandra/1.2/cassandra/configuration/configCassandra_yaml_r.html#reference_ds_qfg_n1r_1k__compaction_throughput_mb_per_sec>
first.

I always thought it was per-node, and I'm guessing this comes down to a lack
of clarity in the documentation. (For reference, a nodetool sketch of the
runtime change Jeff describes is at the bottom of this message.)

On Mon, Jan 4, 2016 at 5:06 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:

> Why do you think it’s cluster wide? That param is per-node, and you can
> change it at runtime with nodetool (or via the JMX interface, using
> jconsole to connect to ip:7199).
>
> From: Ken Hancock
> Reply-To: "user@cassandra.apache.org"
> Date: Monday, January 4, 2016 at 12:59 PM
> To: "user@cassandra.apache.org"
> Subject: compaction_throughput_mb_per_sec
>
> I was surprised the other day to discover that this was a cluster-wide
> setting. Why does that make sense?
>
> In a heterogeneous Cassandra deployment, say I have some old servers
> running spinning disks and I'm bringing on more nodes that use SSDs. I
> want to have different compaction throttling on different nodes to
> minimize the impact on read times.
>
> I can already balance data ownership through either token allocation or
> vnode counts.
>
> Also, as I increase my node count, I technically also have to increase my
> compaction throughput, which would require a rolling restart across the
> cluster.
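For anyone who finds this thread later, here is roughly what the per-node
runtime change Jeff describes looks like with nodetool (which talks to the
same JMX port, 7199 by default). This is only a sketch: the node address and
the 24 MB/s value are placeholders, and getcompactionthroughput may not be
available on older releases.

    # Show the current per-node compaction throttle (newer versions only)
    nodetool -h 10.0.0.1 getcompactionthroughput

    # Set a new per-node value in MB/s; 0 disables throttling.
    # As far as I know this does not survive a restart; update cassandra.yaml
    # on that node if you want it to persist.
    nodetool -h 10.0.0.1 setcompactionthroughput 24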