Hi all,

After 3 years of running Cassandra in production (from 0.6 to 1.0.6), I'm now running into problems: twice in the last 3 weeks, 2 nodes crashed with an OutOfMemory exception. In the log I can see the warning about low available heap. I'm now adding a little RAM, increasing my Java heap (1/4 of the RAM), and reducing the row size and the memtable thresholds. Any other tips?
Now a question: why, with 2 nodes offline, does my whole application stop providing service, even when a Consistency Level One read is invoked?

I expected this behaviour:

- CL1 operations keep working
- more than 80% of CLQ operations keep working (the offline nodes were 2 and 5; with a clockwise key distribution, only writes to the fifth node should impact node 2)
- most CLALL operations (which I don't use) fail

What actually happened is that ALL services stopped responding, throwing a TTransportException ...

Thanks in advance,
Carlo
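
For reference, the expected behaviour described above can be sketched as a small calculation. This is only a rough model under stated assumptions: the cluster size and replication factor are not given in this post, so the values 6 and 3 in the usage lines are hypothetical, as are the function names. It also assumes equally sized token ranges and SimpleStrategy-style placement (each range is stored on its primary node and on the next RF-1 nodes clockwise), and it only models replica availability, not the client connection itself.

```python
def quorum(rf):
    """Replicas needed for QUORUM at replication factor rf."""
    return rf // 2 + 1


def available_fraction(n_nodes, rf, down, required):
    """Fraction of token ranges that still have `required` live replicas.

    Nodes are numbered 0..n_nodes-1 around the ring. Placement is assumed
    to be primary node plus the next rf-1 nodes clockwise (an assumption,
    not taken from the post).
    """
    ok = 0
    for primary in range(n_nodes):
        replicas = [(primary + i) % n_nodes for i in range(rf)]
        live = sum(1 for r in replicas if r not in down)
        if live >= required:
            ok += 1
    return ok / n_nodes


# Hypothetical cluster: 6 nodes, RF=3, nodes 2 and 5 down (0-based: 1 and 4).
n, rf, down = 6, 3, {1, 4}
print("ONE   ", available_fraction(n, rf, down, 1))
print("QUORUM", available_fraction(n, rf, down, quorum(rf)))
print("ALL   ", available_fraction(n, rf, down, rf))
```

With different values for the cluster size, replication factor, or set of down nodes, the same function gives the share of ranges that each consistency level should still be able to serve.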