> I have been having problems running 0.7RC2 where one of my two nodes
> routinely goes down. Sometimes both of them go down. I am running the nodes
> using Ubuntu Lucid LTS 64-bit with kernel version 2.6.32. Currently, both
> nodes are running on micro instances on EC2. I will eventually migrate to
> large instances... but I can't seem to get Cassandra to stay up for more than
> 1 day at a time.

We need more information. What does "go down" mean? Does the JVM get killed? Does it stop responding to requests but remain running? If the latter, what does it do - does it spin CPU? What *does* show up in e.g. the Cassandra system log when this happens (error messages or not)?

You're saying you're running on micro instances. Have you configured your nodes appropriately, in particular the memory thresholds? If you are using the out-of-the-box config on a micro instance (isn't that like 512 MB?), the max heap size will probably be 256 MB. And with out-of-the-box cassandra.yaml settings I would not be surprised if you're dying with an OutOfMemory error, possibly with high amounts of GC activity before actually dying.

-- 
/ Peter Schuller
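
As a rough illustration of the two checks suggested above, here is a minimal Python sketch. It rests on assumptions not stated in the thread: that the default max heap is half of system RAM (which matches the 512 MB -> 256 MB guess but should be verified against your own startup script), and that the system log lives at whatever path you pass in. The function names are hypothetical helpers, not part of Cassandra.

```python
import re
from typing import Iterable, List

# java.lang.OutOfMemoryError is the standard JVM error class name, so a
# plain substring match is enough to spot it in a log.
OOM_PATTERN = re.compile(r"OutOfMemoryError")


def find_oom_lines(lines: Iterable[str]) -> List[str]:
    """Return every line that mentions an OutOfMemoryError, newline stripped."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def default_max_heap_mb(total_ram_mb: int) -> int:
    """Estimate the default max heap in MB.

    ASSUMPTION: the out-of-the-box config uses half of system RAM.
    Check your own startup script before relying on this.
    """
    return total_ram_mb // 2


def scan_log(path: str) -> List[str]:
    """Scan a log file at `path` (location is an assumption; pass your own)."""
    with open(path, errors="replace") as fh:
        return find_oom_lines(fh)
```

Calling `scan_log()` on the node's system log after a crash would show whether the JVM ran out of heap, and `default_max_heap_mb(512)` gives the 256 MB figure mentioned above.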