> The node (known as the "coordinating node" because it co-ordinates the
> request submitted by the client) will send the request to the nodes
> that are in the replica set for the row. The client need not care
> about which host it connects to, other than that it be "one of the
> ones in the correct cluster".
>
> Is this the source of confusion? There is never any issue with talking
> to the "wrong" node outside of the replica set. The client just
> doesn't care; Cassandra takes care of it.

That is not what I am confused about. I understand that the coordinating node proxies the request to the node that actually holds the data and relays the result back to the client. I am not concerned about talking to the "wrong" node.

I am concerned about the following scenario: N-1 nodes are down, so only 1 node is online and available for communication. This node owns a specific token range. Yes, it can operate as a coordinating node, but all of the other nodes are offline, so they do not come into play.

> It sounds like you're worried about data ending up on nodes that are
> outside of the replica set, or that reads are done outside of the
> replica set. This is not something you need to worry about. Just
> submit your queries to the cluster and the appropriate nodes will be
> serving the requests.

I am worried that if only 1 node is online and the other N-1 nodes are down, the cluster will not be able to complete the operation, because not all of the data is available on the 1 node that is up.
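
To make the scenario concrete, here is a toy model of it in Python. This is a sketch under stated assumptions, not Cassandra's actual code: it assumes a four-node ring, that each node owns the range ending at its own token, and that the replicas for a row are simply the next RF nodes clockwise on the ring (roughly what a simple placement strategy does). The names `replicas_for` and `can_serve`, the node names, and the token values are all made up for illustration.

```python
from bisect import bisect_left

def replicas_for(token, ring, rf):
    """Return the rf nodes responsible for `token` in this toy model.

    ring is a list of (node_token, node_name) sorted by node_token.
    Assumption: a node owns the range (previous_token, its_token], and
    the remaining replicas are the next nodes clockwise.
    """
    tokens = [t for t, _ in ring]
    # First node whose token is >= the row token, wrapping past the end.
    start = bisect_left(tokens, token) % len(ring)
    count = min(rf, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]

def can_serve(token, ring, rf, live, required):
    """True if at least `required` replicas for `token` are in `live`."""
    alive = [n for n in replicas_for(token, ring, rf) if n in live]
    return len(alive) >= required

# Hypothetical 4-node ring; only node A is up (N-1 nodes down).
ring = [(0, "A"), (25, "B"), (50, "C"), (75, "D")]
live = {"A"}

for row_token in (10, 30, 60, 80):
    owners = replicas_for(row_token, ring, rf=2)
    ok = can_serve(row_token, ring, rf=2, live=live, required=1)
    print(row_token, owners, "served" if ok else "unavailable")
```

In this model, with a replication factor of 2 and only A online, rows whose replica set includes A can still be answered when one replica is enough, while rows replicated only on B, C, and D cannot. That is exactly the case I am asking about.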