http://wiki.apache.org/cassandra/ArchitectureInternals
2010/5/20 史英杰 <shiyingjie1...@gmail.com>:
> What internal mechanism does Cassandra use to get this kind of fault
> tolerance?
>
> 2010/5/20 Simon Smith <simongsm...@gmail.com>
>>
>> On Thu, May 20, 2010 at 8:08 AM, 史英杰 <shiyingjie1...@gmail.com> wrote:
>> > Hi, All,
>> > I am now learning the mechanism Cassandra adopts to get high
>> > availability and fault tolerance. As far as I know, we connect to one
>> > Cassandra server first, and then read or write data through it. So if
>> > the server we are connected to goes down, what will happen? Do we
>> > have to reconnect to another server, or will Cassandra handle this
>> > situation?
>>
>> The approach we're taking is to put the software load balancer haproxy
>> in front of our Cassandra cluster. Use "mode tcp" within haproxy's
>> config. I notice that Tragedy (http://github.com/enki/tragedy/) also
>> lets you put a list of servers into the connection call (we're going
>> to put the list of haproxy load balancers here).
>>
>> > Another situation: if a server that is involved in the process of
>> > reading data fails, what will Cassandra do?
>>
>> If you're using Thrift to connect, catch the exceptions that the library
>> throws when it is unable to connect, and then try to connect again. This
>> is going to happen: if/when a node goes down, it causes the entire
>> cluster to hiccup a little, so if it is critical that any particular
>> read transaction succeeds, you may need to sleep for as much as 5 seconds
>> before retrying (this is just my experience).
>>
>> > Thanks a lot!
>> >
>> > Yingjie

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
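
For illustration only, the client-side advice quoted above (keep a list of servers, catch the connection error, pause, try again) could be sketched roughly as follows. This is not code from the thread or from any Cassandra client: `connect_with_failover` and its parameters are hypothetical names, and the plain socket connection stands in for whatever call your Thrift client uses to open its transport. The 5-second default comes from Simon's remark about his own experience.

```python
import socket
import time


def connect_with_failover(hosts, connect=socket.create_connection,
                          attempts=3, delay=5.0, sleep=time.sleep):
    """Try each (host, port) in turn; pause and retry if all of them fail.

    Hypothetical helper: `connect` is a stand-in for the call that opens
    a connection in your client library.
    """
    last_error = None
    for attempt in range(attempts):
        for address in hosts:
            try:
                return connect(address)
            except OSError as exc:
                # Plain sockets raise OSError; with a Thrift client you would
                # catch that library's own transport exception here instead.
                last_error = exc
        if attempt < attempts - 1:
            # Every host failed this round; wait before the next round so the
            # cluster has time to settle after a node failure.
            sleep(delay)
    raise ConnectionError(
        f"no host reachable after {attempts} attempts") from last_error
```

A call might look like `connect_with_failover([("lb1.example", 9160), ("lb2.example", 9160)])`, where the host names are placeholders for your own haproxy instances or Cassandra nodes. Passing `connect` and `sleep` as parameters keeps the retry logic independent of any particular client library and easy to exercise without a live cluster.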