On Tue, Apr 26, 2011 at 9:01 AM, Terje Marthinussen <tmarthinus...@gmail.com> wrote:
> Hi,
> I was testing the multithreaded compactions and with 2x6 cores (24 with HT)
> it does seem a bit crazy with 24 compactions running concurrently.
> It is probably not very good in terms of random I/O.

It does seem a bit overkill. However, I'm slightly curious how you ended
up with 24 parallel compactions. More precisely, how did you end up with
enough sstables to trigger 24 compactions? Was that done on purpose for
testing's sake, or did you see that in a real situation? I'm asking
because in a 'real' situation, given that compactions are triggered only
if there is some number of files to compact, and provided the cluster is
correctly provisioned, I wouldn't expect the number of parallel
compactions to jump to such numbers (one of the goals of multi-threaded
compaction was to make sure we never end up accumulating lots of
un-compacted sstables). Anyway, I get your point, just wondering if that
was a real situation.

> As such, I think I agree with the argument in 2191 that there should be a
> config option for this.
> Probably a default that is dynamic with 1 thread per column family +2 or 3
> thread for parallel compactions outside of that could be good.
> Any other opinions?

Multi-threaded compaction is optional and compaction throttling is
supposed to mitigate it. However, I do agree that too many compactions
may be a bad use of resources because of random I/O, even if correctly
throttled. I think what is missing is a configuration option
"concurrent_compactions", like there is "concurrent_writes|reads". For
that, I have created https://issues.apache.org/jira/browse/CASSANDRA-2558

> I guess the compaction thread pool should also show up in tpstats?

Yes it should ... and it will ... eventually :)

Thanks for the feedback.

--
Sylvain
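
For illustration only, the idea of capping parallel compactions discussed
in this thread can be sketched with a fixed-size thread pool. This is a
rough sketch under stated assumptions, not Cassandra's implementation:
the class CompactionPoolSketch and the methods dynamicDefault and
peakParallelism are hypothetical names, the "column families + extra
threads" formula is just the default floated above, and a short sleep
stands in for real compaction work.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CompactionPoolSketch {

    // Hypothetical helper: the dynamic default proposed in the thread,
    // one thread per column family plus a few extra, never below 1.
    static int dynamicDefault(int columnFamilies, int extraThreads) {
        return Math.max(1, columnFamilies + extraThreads);
    }

    // Submits `tasks` fake compactions to a pool capped at
    // `concurrentCompactions` threads and returns the highest number
    // that were observed running at the same time.
    static int peakParallelism(int concurrentCompactions, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(concurrentCompactions);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(20); // stand-in for compaction work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // Assumed example values: 4 column families, 2 extra threads,
        // 24 pending compactions as in the report above.
        int cap = dynamicDefault(4, 2);
        int peak = peakParallelism(cap, 24);
        System.out.println("cap=" + cap + " peak=" + peak);
    }
}
```

With a cap in place, the 24 pending compactions queue up behind the pool
instead of all hitting the disks at once; the observed peak can never
exceed the cap.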