Using PropertyFileSnitch, you can fine-tune the topology of the cluster. What you tell Cassandra about your "DC" and "rack" doesn't have to match how they are laid out in real life. You can create virtual DCs for Cassandra and even treat each node as a separate rack.
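To make the idea of a "virtual" topology concrete, here is a minimal Python sketch of the mapping such a snitch works from. It assumes entries in the `<Node IP>=<DC Name>:<Rack Name>` form used by cassandra-topology.properties. The addresses, DC names, and the helper functions `parse_topology` and `nodes_by_dc` are hypothetical and only illustrate the idea; this is not Cassandra's own parsing code.

```python
from collections import defaultdict

# Hypothetical entries: three machines that could sit in the same physical
# rack, declared to Cassandra as two virtual DCs with one rack per node.
TOPOLOGY = """
# Format is <Node IP>=<DC Name>:<Rack Name>
10.0.0.1=DC1_realtime:node_1
10.0.0.2=DC1_realtime:node_2
10.0.0.3=DC1_analytics:node_3
"""


def parse_topology(text):
    """Return {ip: (dc, rack)} for every non-comment, non-blank line."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        ip, _, location = line.partition("=")
        dc, _, rack = location.partition(":")
        mapping[ip.strip()] = (dc.strip(), rack.strip())
    return mapping


def nodes_by_dc(mapping):
    """Group node IPs by the DC name they were declared with."""
    groups = defaultdict(list)
    for ip, (dc, _rack) in sorted(mapping.items()):
        groups[dc].append(ip)
    return dict(groups)


if __name__ == "__main__":
    topology = parse_topology(TOPOLOGY)
    for dc, ips in nodes_by_dc(topology).items():
        print(dc, ips)
```

The point of the sketch is that the DC and rack labels are just strings attached to each address, so nothing stops you from declaring more "data centers" than you physically have.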
For example, in cassandra-topology.properties:

    # Format is <Node IP>=<DC Name>:<Rack Name>
    192.168.0.11=DC1_realtime:node_1
    192.168.0.12=DC1_realtime:node_2
    192.168.0.13=DC1_analytics:node_3
    192.168.1.11=DC2_realtime:node_1

If you then specify the parameters for the keyspace to use these, you can control exactly which set of nodes replicas end up on. For example, in cassandra-cli:

    create keyspace ks1
      with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
      and strategy_options = { DC1_realtime: 2, DC1_analytics: 1, DC2_realtime: 1 };

As far as I know, there isn't any way to use the rack name in the strategy_options for a keyspace. You might want to look at the code to dig into that.

Whichever snitch you use, the nodes are sorted in order of proximity to the client node. How this is determined depends on the snitch, but most (the ones that ship with Cassandra) use the default ordering of same-node < same-rack < same-datacenter < different-datacenter. Each snitch has methods to tell Cassandra which rack and DC a node is in, so it always knows which node is closest. Used with the Bloom filters this can tell us where the nearest replica is.

-----Original Message-----
From: prasenjit mukherjee [mailto:prasen....@gmail.com]
Sent: 11 July 2012 06:33
To: user
Subject: How to come up with a predefined topology

Quoting from http://www.datastax.com/docs/0.8/cluster_architecture/replication#networktopologystrategy :

"Asymmetrical replication groupings are also possible depending on your use case. For example, you may want to have three replicas per data center to serve real-time application requests, and then have a single replica in a separate data center designated to running analytics."

I have 2 questions:

1. Is there an example of how to configure a topology with 3 replicas in one DC (2 in one rack + 1 in another rack) and one replica in another DC? The default NetworkTopologyStrategy with RackInferringSnitch will only give me an equal distribution (2+2).

2. I am assuming the reads can go to any of the replicas. Is there a client which will send a query to the node (in the Cassandra ring) that is closest to the client?

-Thanks,
Prasenjit