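Did the Cassandra cluster itself go down, or did the client start reporting failures when it routed queries to the downed node? The key on the client side is to keep working around the ring when the initial node is down: if the first host it tries is unreachable, it should retry against the next one rather than give up.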
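To make "working around the ring" concrete, here is a rough sketch of the idea in Python. It is not Hector's API; `run_with_failover`, `AllHostsFailed`, and `fake_query` are names I made up for illustration, and the host list is just taken from your ring output. A real client would also need timeouts and some way to re-admit a host once it comes back.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class AllHostsFailed(Exception):
    """Raised when every host in the list failed the request."""


def run_with_failover(hosts: Sequence[str], request: Callable[[str], T]) -> T:
    """Try the request on each host in order until one succeeds."""
    errors = []
    for host in hosts:
        try:
            return request(host)
        except ConnectionError as exc:
            # This host is down or unreachable; move on to the next one.
            errors.append((host, exc))
    # Only give up once every host has been tried.
    raise AllHostsFailed(errors)


# Hypothetical stand-in for a real client call: pretend .153 is down.
def fake_query(host: str) -> str:
    if host == "192.168.89.153":
        raise ConnectionError(f"{host} unreachable")
    return f"answered by {host}"


hosts = ["192.168.89.153", "192.168.89.155", "192.168.89.154"]
print(run_with_failover(hosts, fake_query))  # answered by 192.168.89.155
```

If the client only ever knows about one host, a loop like this has nothing to fall back on, so it is worth checking how many hosts the client is actually configured with.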
--Joe

On Apr 9, 2011, at 12:52 PM, Vram Kouramajian wrote:

> We have 5 Cassandra nodes with the following configuration:
>
> Cassandra Version: 0.6.11
> Number of Nodes: 5
> Replication Factor: 3
> Client: Hector 0.6.0-14
> Write Consistency Level: Quorum
> Read Consistency Level: Quorum
> Ring Topology:
> Owns Range Ring
>
> 132756707369141912386052673276321963528
> 192.168.89.153 Up 4.15 GB 33.87%
> 20237398133070283622632741498697119875 |<--|
> 192.168.89.155 Up 5.17 GB 18.29%
> 51358066040236348437506517944084891398 |   ^
> 192.168.89.154 Up 7.41 GB 33.97%
> 109158969152851862753910401160326064203 v   |
> 192.168.89.152 Up 5.07 GB 6.34%
> 119944993359936402983569623214763193674 |   ^
> 192.168.89.151 Up 4.22 GB 7.53%
> 132756707369141912386052673276321963528 |-->|
>
> We believe that our setup should survive the crash of one of the
> Cassandra nodes. But we had a few crashes, and the system stopped
> functioning until we brought the Cassandra nodes back.
>
> Any clues?
>
> Vram
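
For reference, the arithmetic behind "should survive the crash of one node" can be checked in a few lines. This is only a sketch of the usual quorum rule (a majority of the replicas, i.e. floor(RF / 2) + 1); the function names are mine, not anything from Cassandra or Hector.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def tolerated_replica_failures(replication_factor: int) -> int:
    """Replicas of a single key that can be down while QUORUM still succeeds."""
    return replication_factor - quorum(replication_factor)


rf = 3
print(quorum(rf))                      # 2
print(tolerated_replica_failures(rf))  # 1
```

So with a replication factor of 3, each key can lose one of its replicas and still satisfy Quorum reads and writes. If two replicas of the same range are down at once, Quorum requests for that range will fail until one of them is back.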