On Tue, Sep 14, 2010 at 10:43 AM, Chris Jansen
<chris.jan...@cognitomobile.com> wrote:
> Hi All,
>
> I’m a newbie to Cassandra so I could have a configuration issue here, I am
> using the latest stable release 0.6.0.
>
> I have created a cluster of 3 nodes, a keyspace with RF=2 and a rack unaware
> replication strategy. When I write with CL=QUORUM, all 3 nodes commit
> the data fine, but when I write with the same CL with one of the nodes down
> I see an UnavailableException thrown. Surely if one of the nodes in the
> cluster is down another should acknowledge the writes and maintain the
> quorum, or is there something that I have misunderstood? From what I
> understand, in this case with a RF=2 for the quorum writes to succeed I need
> two nodes to acknowledge the write (RF/2+1), which I have.

RF=2 means that each row is replicated on 2 of your nodes. As you said,
the quorum is then 2. This means that for a quorum operation to succeed,
both of the 2 nodes that hold the row (*not* any 2 out of all the nodes)
must be alive. In other words, if *any* of your nodes is dead, some
operations will fail with an UnavailableException. Quorum tolerates a
node being down only starting at RF=3.

> Here is how the cluster looks when quorum writes succeed:
>
> 192.168.245.2 Up   477.33 KB 78502309573904554351249603414557542595  |<--|
> 192.168.245.4 Up   426.74 KB 139625953069891725539207365034742863768 |   |
> 192.168.245.1 Up   496.67 KB 163572901304139170217093255272499595459 |-->|
>
> This is how it looks with one node down and quorum writes fail (I am writing
> to 192.168.245.1):
>
> 192.168.245.2 Down 423.58 KB 78502309573904554351249603414557542595  |<--|
> 192.168.245.4 Up   426.74 KB 139625953069891725539207365034742863768 |   |
> 192.168.245.1 Up   496.67 KB 163572901304139170217093255272499595459 |-->|
>
> Here is the exception that is thrown:
>
> Cannot write: 9e48b039-7687-4b14-9b40-0096b15fd7b0 RETRYING
> UnavailableException()
>     at org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:12303)
>     at org.apache.cassandra.thrift.Cassandra$Client.recv_insert(Cassandra.java:675)
>     at org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:648)
>     at cassandraclient.Main.writeReadDelete(Main.java:101)
>     at cassandraclient.Main.run(Main.java:188)
>     at java.lang.Thread.run(Thread.java:619)
>
> If I switch CL=ONE the writes succeed, but I don’t know if the data is being
> replicated.

Whatever consistency level you use for a write, the data is always
replicated unless some error occurs. The difference is how many replica
acknowledgements the write waits for before it reports success.

--
Sylvain

> Any help would be greatly appreciated, thanks.
>
> Chris Jansen
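
The quorum arithmetic in the reply above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it models rack-unaware placement as "the node owning the token plus the next RF-1 nodes clockwise on the ring", the function names are hypothetical, and none of it is Cassandra's own code. The tokens and addresses are the ones from the ring output in the thread.

```python
from bisect import bisect_left

def quorum(rf):
    """Replica acknowledgements needed for QUORUM: RF/2 + 1 (integer division)."""
    return rf // 2 + 1

def replicas_for(token, ring, rf):
    """Replicas for a row token under the simplified rack-unaware model.

    `ring` is a list of (token, node) pairs sorted by token. The first replica
    is the node with the smallest token >= the row token (wrapping around),
    followed by the next rf - 1 nodes clockwise. This placement is an
    assumption made for illustration.
    """
    tokens = [t for t, _ in ring]
    start = bisect_left(tokens, token) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(rf)]

def can_satisfy(token, ring, rf, live, required):
    """True if at least `required` replicas of the row are in the `live` set."""
    alive = sum(1 for node in replicas_for(token, ring, rf) if node in live)
    return alive >= required

# Ring taken from the output quoted in the thread.
RING = sorted([
    (78502309573904554351249603414557542595, "192.168.245.2"),
    (139625953069891725539207365034742863768, "192.168.245.4"),
    (163572901304139170217093255272499595459, "192.168.245.1"),
])

# 192.168.245.2 is down, as in the failing case.
LIVE = {"192.168.245.4", "192.168.245.1"}

if __name__ == "__main__":
    for rf in (2, 3):
        for token, owner in RING:
            ok = can_satisfy(token, RING, rf, LIVE, quorum(rf))
            print(f"RF={rf} row owned by {owner}: QUORUM possible = {ok}")
```

Under this model, with RF=2 and 192.168.245.2 down, only rows whose two replicas are 192.168.245.4 and 192.168.245.1 can still reach quorum; rows in the other two ranges cannot. With RF=3 every row has three replicas and a quorum of 2, so a single dead node is tolerated.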