OK, thanks to the excellent help of Datastax folks, some of the more
severe inconsistencies in my Cassandra cluster were fixed (after a node
was down and compactions failed etc).
I'm still having the problems reported in the "repairs 0.8.6" thread.
Thing is, why is it so easy for the repair process to break? OK, I admit
I'm not sure why nodes are reported as "dead" once in a while, but I'm
certain they don't simply fall off the edge or get knocked out for 10
minutes or anything like that. Why is there no built-in tolerance/retry
mechanism, so that a node that seems silent for a minute can be
contacted again later, or, better yet, a different node holding the
relevant replica is contacted instead?
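For now, the best stop-gap I can think of is wrapping repair in a retry
loop on the client side instead of letting a single failure kill the
whole run. A rough, untested sketch (the host, keyspace name and retry
timings below are just placeholders, not from a real setup):

#!/usr/bin/env python
# Stop-gap: retry "nodetool repair" a few times instead of giving up
# on the first failure.
import subprocess
import sys
import time

HOST = "localhost"          # node to run repair against (placeholder)
KEYSPACE = "my_keyspace"    # hypothetical keyspace name
MAX_ATTEMPTS = 3            # how many times to try before giving up
RETRY_DELAY_SECS = 300      # wait 5 minutes between attempts

def run_repair():
    cmd = ["nodetool", "-h", HOST, "repair", KEYSPACE]
    for attempt in range(1, MAX_ATTEMPTS + 1):
        print("repair attempt %d: %s" % (attempt, " ".join(cmd)))
        rc = subprocess.call(cmd)
        if rc == 0:
            print("repair finished OK")
            return 0
        print("repair failed (exit %d), retrying in %ds"
              % (rc, RETRY_DELAY_SECS))
        time.sleep(RETRY_DELAY_SECS)
    print("giving up after %d attempts" % MAX_ATTEMPTS)
    return 1

if __name__ == "__main__":
    sys.exit(run_repair())

But this is really something the server should be handling itself.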
As was evident from some of the presentations at Cassandra-NYC
yesterday, failed compactions and repairs are a major problem for a
number of users; a cluster can quickly become unusable because of them.
I think it would be a good idea to build more robustness into these
procedures.
Regards
Maxim