I followed your advice and installed a cluster of 3 m1.small instances. The problem is still there. I get fewer timeouts because compactions are less frequent, thanks to the larger amount of memory usable before flushing, but when a compaction starts, CPU usage can reach 95%, which produces timeouts. Compactions run faster, so I have fewer timeouts, but there are still some.
Is there really no way to turn compaction into a background, low-CPU task? What kind of information can I give you to help you understand what is going on with these timeouts?

2011/11/15 Dan Hendry <dan.hendry.j...@gmail.com>

> I really don’t recommend using t1.micros. The problem with them is that
> they have CPU bursting, basically meaning you get lots of CPU resources for
> a short time but if you use more than you have been allocated you get
> basically nothing for 10+ seconds afterwards. By ‘basically nothing’ I
> really mean that – the machine is effectively dead. The biggest problem
> with this (which we found out the hard way, within a test environment
> thankfully) is that it makes capacity planning extremely difficult – the
> line between having a cluster with sufficient capacity and being overloaded
> is extremely abrupt and very difficult to see coming. Moreover, once you are
> over capacity, the ‘dead periods’ caused by CPU bursting cause things to
> spiral out of control rapidly due to overly aggressive client retries and
> hinted handoff increasing overall load (although the HH problem might have
> improved with 1.0.x). I would recommend m1.smalls at the very least.
>
> If you are set on micros, make sure you only ever trigger compaction on
> one node at a time (or better, consider if you even need to trigger major
> compactions at all), set compaction_throughput_mb_per_sec (cassandra.yaml)
> as low as you possibly can (1 is the minimum I believe), try disabling
> hinted handoff (on all nodes), and use lower read/write consistency levels
> if you can.
>
> Dan
>
> From: Alain RODRIGUEZ [mailto:arodr...@gmail.com]
> Sent: November-15-11 6:34
> To: user@cassandra.apache.org
> Subject: Compaction -> CPU load 100% -> time out
>
> Hi, I'm running a 3 node cassandra 1.0.2 cluster on 3 Amazon EC2 t1.micro.
>
> I managed to fix some OOMs I had, but I still have some spikes of CPU load.
>
> I know that t1.micros have limited resources, but I think they could be
> enough if they were well managed.
>
> My application works well, except when Cassandra needs to run a
> compaction on a node. To do it, Cassandra uses 100% of the CPU, generating
> a lot of timeouts. My timeout is configured to 250 ms with 2 attempts max.
> I'm running in production: our current system uses MySQL and we are trying
> to replace MySQL with Cassandra. Cassandra mustn't slow down the production
> environment while we use both DBs in parallel, which is why I can't
> increase the time before a timeout.
>
> Running this compaction in the background somehow could be a good idea.
> After searching on this subject, I tried adding
> JVM_OPTS="$JVM_OPTS -Dcassandra.compaction.priority=1"
> to cassandra-env.sh.
>
> This option was added for Cassandra 0.6.3; is it still useful? It
> doesn't resolve my problem.
>
> Anyway, this doesn't help while performing a nodetool repair: the CPU
> load is still 100%.
>
> Is there a way to turn these exceptional tasks into background tasks,
> using only available CPU?
>
> Is there a way to get Cassandra working properly on EC2 t1.micros?
>
> Thanks,
>
> Alain
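
For reference, the cassandra.yaml settings Dan suggests above might look roughly like the fragment below. This is only a sketch: the values are the ones named in his message, not tested recommendations, and the option names are assumed to match the cassandra.yaml shipped with 1.0.x, so they should be checked against your own file.

```yaml
# cassandra.yaml (fragment), assuming Cassandra 1.0.x option names

# Throttle compaction throughput for the whole node. Dan suggests setting
# this as low as possible and believes 1 is the minimum.
compaction_throughput_mb_per_sec: 1

# Disable hinted handoff, per Dan's suggestion (apply on all nodes).
hinted_handoff_enabled: false
```

Note that a value of 0 for compaction_throughput_mb_per_sec disables throttling rather than minimizing it, so it should not be used when the goal is to limit compaction load.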