What inner mechanism does Cassandra adopt to get this kind of fault tolerance?

2010/5/20 Simon Smith <simongsm...@gmail.com>

> On Thu, May 20, 2010 at 8:08 AM, 史英杰 <shiyingjie1...@gmail.com> wrote:
> > Hi, All,
> > I am now learning the mechanism Cassandra adopts to get high
> > availability and fault tolerance. As I know, we should connect to one
> > server of Cassandra first, then we can read or write data through it, so if
> > the server which we connect to goes down, what will happen? Do we have to
> > reconnect to another server, or will Cassandra handle this situation?
>
> The approach we're taking is to put the software load-balancer haproxy
> in front of our cassandra cluster. Use "mode tcp" within haproxy's
> config. I notice that Tragedy (http://github.com/enki/tragedy/) also
> lets you put a list of servers into the connection call (we're going
> to put the list of haproxy load balancers here).
>
> > Another situation: if the server which is involved in the process of
> > reading data fails, what will Cassandra do?
>
> If you're using Thrift to connect, catch the exceptions that library
> throws if unable to connect and then try to connect again. This is
> going to happen - if/when a node goes down it causes the entire
> cluster to hiccup a little, so if it is critical that any particular
> read transaction succeeds, you may need to sleep as much as 5 seconds
> (this is just my experience).
>
> > Thanks a lot!
> >
> > Yingjie
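
For what it's worth, the "catch the exception, try again, sleep up to about 5 seconds" advice could be sketched on the client side roughly like this. It is only a sketch under assumptions: `call_with_failover`, `operation`, and the host names are hypothetical and not part of Cassandra, Thrift, or Tragedy, and with a real Thrift client you would pass that library's own connection exceptions in `retry_on` instead of `OSError`.

```python
import time


def call_with_failover(hosts, operation, retry_on=(OSError,),
                       max_wait=5.0, pause=0.5, sleep=time.sleep):
    """Run operation(host) against each host in turn until one succeeds.

    hosts     -- list of server (or load balancer) addresses to try
    operation -- hypothetical callable doing the actual read or write
    retry_on  -- exception types treated as "node unreachable"
    max_wait  -- total time to keep sleeping between passes (about 5 s,
                 following the experience reported in the thread)
    """
    if not hosts:
        raise ValueError("hosts must not be empty")
    waited = 0.0
    while True:
        last_error = None
        for host in hosts:
            try:
                return operation(host)
            except retry_on as exc:
                last_error = exc  # remember the failure, try the next host
        if waited >= max_wait:
            raise last_error  # every host failed and the wait budget is spent
        sleep(pause)  # give the cluster a moment to recover
        waited += pause


if __name__ == "__main__":
    def read_row(host):
        # Stand-in for a real client call; pretends the first node is down.
        if host == "cass1.example:9160":
            raise ConnectionRefusedError(host + " is down")
        return "row fetched via " + host

    print(call_with_failover(["cass1.example:9160", "cass2.example:9160"],
                             read_row))
```

The host list can just as well hold the haproxy addresses instead of the Cassandra nodes themselves, as described above; the retry loop is the same either way.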