[ the following is some hefty speculation and not based on production experience with Cassandra specifically ]
> *never* complete. And compactions are expensive. They essentially make a
> node useless. So, we're left with 3/4 of a cluster, since we only have 4
> nodes.

Do you have a feel for why the impact is so great in your particular case - is your read load high enough that the added I/O, even if it is sequential, tips it over, or are reads just generally unusably slow for some reason?

So far, when I've done benchmarking (I still have not used Cassandra in production), compaction has tended to be CPU bound with smallish values, such that I have not expected the I/O impact to be unmanageable. (I am assuming that if it becomes unusably slow, it is because of disk I/O rather than CPU - right?) I have been worried about its effects on latency-sensitive reads though, particularly when you start pushing the underlying storage rather than the CPU (have there been thoughts of rate limiting compaction speed?).

> Since then, another node in the cluster has started queueing up compactions.
> This is on pretty beefy hardware, too:
> 2 x E5620, 24GB, 2 x 15kRPM SAS disks in RAID1 for data, and 1 x 7200RPM
> SATA for commit logs.
> I guess we need more nodes? But, we only have about 80GB total per node,
> which doesn't really seem like that much for that kind of hardware?

As far as I can tell, the amount of data should only affect the total time/CPU spent on compactions; so while it is relevant in cases where compaction is not keeping up, it should not really help with mitigating the effects of compaction on other I/O - at least as long as the data size doesn't become so small that it no longer thrashes caches etc.

Is the read load high enough that it completely screws up the I/O scheduling done by the OS, so that you don't get sequential reading/writing speeds during compaction? Do you have any kind of comparison of the speed (in terms of MB/second) that compaction runs at when the nodes are idle vs. when they are taking read traffic?
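To make the rate-limiting idea concrete: one simple way to cap compaction throughput would be a token bucket wrapped around the copy loop. This is purely an illustrative sketch in Python - `TokenBucket` and `throttled_copy` are hypothetical names of mine, not anything Cassandra actually implements:

```python
import time

class TokenBucket:
    """Hypothetical throttle: tokens (bytes) refill at a fixed rate,
    and callers sleep when they consume faster than the refill rate."""

    def __init__(self, rate_bytes_per_sec):
        self.rate = float(rate_bytes_per_sec)
        self.available = 0.0          # bytes we may write right now
        self.last = time.monotonic()

    def consume(self, nbytes):
        now = time.monotonic()
        # Refill based on elapsed time, capping the burst at 1 second's worth.
        self.available = min(self.available + (now - self.last) * self.rate,
                             self.rate)
        self.last = now
        self.available -= nbytes
        if self.available < 0:
            # We overspent; sleep until the deficit is paid back.
            time.sleep(-self.available / self.rate)

def throttled_copy(src, dst, bucket, chunk_size=64 * 1024):
    """Copy src to dst in chunks, limiting throughput via the bucket."""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        bucket.consume(len(chunk))
        dst.write(chunk)
```

The point being that a compaction throttled this way would spread its sequential I/O out over time, leaving scheduler headroom for latency-sensitive reads, at the cost of compactions taking proportionally longer.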
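As a back-of-envelope illustration of why the idle-vs-loaded MB/s comparison matters so much: the throughput numbers below (near-platter sequential speed vs. a seek-bound rate) are my own illustrative assumptions, not measurements from your cluster.

```python
def compaction_time_hours(data_gb, throughput_mb_per_s):
    """Rough time to push data_gb through a compaction at a given
    disk throughput. Compaction reads the input sstables and writes
    a merged output, so roughly 2x the data moves through the disk."""
    total_mb = data_gb * 1024 * 2
    return total_mb / throughput_mb_per_s / 3600.0

# Assumed figures, for illustration only:
compaction_time_hours(80, 100)  # near platter speed: ~0.46 hours
compaction_time_hours(80, 5)    # seek-bound: ~9.1 hours
```

So the same 80GB per node compacts in well under an hour if the OS preserves sequential I/O, but takes most of a working day if read traffic turns it seek bound - which is why I'd want to know which regime you're actually in.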
(I'm ignoring write traffic, since that should mostly only be relevant for CPU.)

Compaction speed is probably going to be very dependent on the OS scheduling I/O in such a way that you get the full benefit of sequential I/O. Are the nodes under high memory pressure from the OS perspective, such that perhaps the I/O buffers are reduced, making the normally sequential I/O seek bound?

On an otherwise idle node you should either be CPU bound (in one thread) during compaction, or you should be seeing disk speeds close to platter speed (assuming the file system is not extremely fragmented).

-- 
/ Peter Schuller