Howdi,

We're using Cassandra 0.6.6 - intending to wait until 0.7 before we do any more upgrades.
We're running a cluster of 16 boxes with 7.1GB of memory each, on Amazon EC2 using Ubuntu 10.04 (LTS).

Today we saw one box kick its little feet up, and after investigating the other machines, it looks like they're all approaching the same fate. Over the past month or so, memory has slowly been exhausted. Neither nodetool drain nor jmap will run; both produce this error:

    Error occurred during initialization of VM
    Could not reserve enough space for object heap

We've got Xmx/Xms set to 4GB. top shows free memory around 50-80MB, file cache under 10MB, and the java process at 12+GB virt and 7.1GB res.

This feels like a Java problem, not a Cassandra one, but I'm open to suggestions.

To ensure I don't get bothered over the weekend, we're doing a rolling restart of Cassandra on each of the boxes now. The last time they were restarted was just over a month ago. Now I'm wondering whether I should (until 0.7.1 is available) schedule a slower rolling restart, spread over several days, every few weeks.

I've shared a Zabbix graph of system memory at:
http://www.imagebam.com/image/3b4213110283969

cheers,
Jedd.
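
P.S. For anyone wanting to compare numbers on their own nodes, here's a rough sketch of how the resident size of the java process could be set against the configured heap. It assumes the Linux /proc/<pid>/status layout (fields such as VmRSS reported in kB); the function names are made up for illustration and aren't part of Cassandra or any of its tools.

```python
import re


def parse_proc_status(text):
    """Return the kB-valued fields of a /proc/<pid>/status dump as a dict of ints."""
    fields = {}
    for line in text.splitlines():
        # Lines of interest look like "VmRSS:     7444889 kB".
        match = re.match(r"^(\w+):\s+(\d+)\s+kB$", line.strip())
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def resident_beyond_heap_kb(status_text, heap_kb):
    """Resident set size minus the configured max heap (Xmx), in kB."""
    return parse_proc_status(status_text)["VmRSS"] - heap_kb


def read_status(pid):
    """Read the raw status text for a process (Linux only)."""
    with open("/proc/%d/status" % pid) as handle:
        return handle.read()
```

Usage would be something like resident_beyond_heap_kb(read_status(pid), 4 * 1024 * 1024) with the pid of the Cassandra JVM. With the figures above (7.1GB res against a 4GB heap) that comes to roughly 3.1GB resident outside the heap.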