Sebastian, Because I have this impression that LCS is IO intensive and it's recommended only on SSDs. So I am curious to see how far it can stress those SSDs. But it turns out the most expensive part about LCS is not IO bound but CUP bound, or more precisely single core speed bound. This is a little surprising.
Of course LCS is still superior in other aspects. On Jan 15, 2016 6:34 PM, "Sebastian Estevez" <sebastian.este...@datastax.com> wrote: > Correct. > > Why are you concerned with the raw throughput, are you accumulating > pending compactions? Are you seeing high sstables per read statistics? > > all the best, > > Sebastián > On Jan 15, 2016 6:18 PM, "Kai Wang" <dep...@gmail.com> wrote: > >> Jeff & Sebastian, >> >> Thanks for the reply. There are 12 cores but in my case C* only uses one >> core most of the time. *nodetool compactionstats* shows there's only one >> compactor running. I can see C* process only uses one core. So I guess I >> should've asked the question more clearly: >> >> 1. Is ~25 M/s a reasonable compaction throughput for one core? >> 2. Is there any configuration that affects single core compaction >> throughput? >> 3. Is concurrent_compactors the only option to parallelize compaction? If >> so, I guess it's the compaction strategy itself that decides when to >> parallelize and when to block on one core. Then there's not much we can do >> here. >> >> Thanks. >> >> On Fri, Jan 15, 2016 at 5:23 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com> >> wrote: >> >>> With SSDs, the typical recommendation is up to 0.8-1 compactor per core >>> (depending on other load). How many CPU cores do you have? >>> >>> >>> From: Kai Wang >>> Reply-To: "user@cassandra.apache.org" >>> Date: Friday, January 15, 2016 at 12:53 PM >>> To: "user@cassandra.apache.org" >>> Subject: compaction throughput >>> >>> Hi, >>> >>> I am trying to figure out the bottleneck of compaction on my node. The >>> node is CentOS 7 and has SSDs installed. The table is configured to use >>> LCS. Here is my compaction related configs in cassandra.yaml: >>> >>> compaction_throughput_mb_per_sec: 160 >>> concurrent_compactors: 4 >>> >>> I insert about 10G of data and start observing compaction. >>> >>> *nodetool compaction* shows most of time there is one compaction. >>> Sometimes there are 3-4 (I suppose this is controlled by >>> concurrent_compactors). During the compaction, I see one CPU core is 100%. >>> At that point, disk IO is about 20-25 M/s write which is much lower than >>> the disk is capable of. Even when there are 4 compactions running, I see >>> CPU go to +400% but disk IO is still at 20-25M/s write. I use *nodetool >>> setcompactionthroughput 0* to disable the compaction throttle but don't >>> see any difference. >>> >>> Does this mean compaction is CPU bound? If so 20M/s is kinda low. Is >>> there anyway to improve the throughput? >>> >>> Thanks. >>> >> >>