Tested out multithreaded compaction in 0.8 last night. We had first fed some data with compaction disabled, so there were 1000+ sstables on the nodes, and I decided to enable multithreaded compaction on one of them to see how it performed vs. the nodes that had no compaction at all.
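For reference, the knobs involved live in cassandra.yaml in 0.8; what I changed was roughly this (the concurrent_compactors line is only there to show the default behaviour, I did not set it explicitly):

    compaction_throughput_mb_per_sec: 128   # default is 16
    multithreaded_compaction: true          # default is false
    # concurrent_compactors defaults to the number of cores if not set
    #concurrent_compactors: 12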
Since this was sort of a test to see how it could perform, I set throughput to 128MB/sec (knowing that this was probably a bit more than it could manage). It quickly generated 24 tmp files for the main CF (24 compaction threads?), the CPUs maxed out at around 90% (2x6 cores), and I started seeing these:

 INFO [FlushWriter:1] 2011-04-24 03:07:46,776 Memtable.java (line 238) Writing Memtable-Test@757483679(23549136/385697094 serialized/live bytes, 32026 ops)
 WARN [ScheduledTasks:1] 2011-04-24 03:07:46,946 MessagingService.java (line 548) Dropped 36506 MUTATION messages in the last 5000ms
 INFO [ScheduledTasks:1] 2011-04-24 03:07:46,947 StatusLogger.java (line 50) Pool Name                    Active   Pending
 INFO [ScheduledTasks:1] 2011-04-24 03:07:46,947 StatusLogger.java (line 65) ReadStage                         0         0
 INFO [ScheduledTasks:1] 2011-04-24 03:07:46,948 StatusLogger.java (line 65) RequestResponseStage              0         3
 INFO [ScheduledTasks:1] 2011-04-24 03:07:46,948 StatusLogger.java (line 65) ReadRepairStage                   0         0
 INFO [ScheduledTasks:1] 2011-04-24 03:07:46,948 StatusLogger.java (line 65) MutationStage                    10     39549
 INFO [ScheduledTasks:1] 2011-04-24 03:07:46,949 StatusLogger.java (line 65) ReplicateOnWriteStage             0         0

That the system is a bit overloaded is not really the question (I wanted to find out what it could manage). The curious part is that when checking tpstats, the MutationStage was mostly idle, but at seemingly regular intervals it would get hit with massive bursts of mutations. Not sure if it is related, but the dropped-messages warning always showed up just before a StatusLogger printout (though not necessarily before all of them). Is some internal event triggering these mutation storms, or is something synchronizing the compaction threads in a way that produces bursts like these?

The messages went away a little while after I reduced the throughput significantly, down to 6MB/sec. This does not seem to be a problem under normal operation, only when doing something extreme like enabling multithreaded compaction when there are already hundreds or thousands of sstables.
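For what it's worth, I was watching for the bursts with something along these lines (paraphrasing from memory, host is a placeholder):

    while true; do nodetool -h localhost tpstats | egrep 'Pool Name|MutationStage'; sleep 1; done

Regards,
Terje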