Using PropertyFileSnitch you can fine-tune the topology of the cluster.

The "DC" and "rack" names you give Cassandra don't have to match the physical 
layout. You can create virtual DCs for Cassandra and even treat each node as a 
separate rack.

For example, in cassandra-topology.properties:

# Format is <Node IP>=<DC Name>:<Rack Name>
192.168.0.11=DC1_realtime:node_1
192.168.0.12=DC1_realtime:node_2
192.168.0.13=DC1_analytics:node_3
192.168.1.11=DC2_realtime:node_1
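
To sanity-check a file like this before rolling it out, something along these 
lines might help. It is only a sketch of my own, not anything that ships with 
Cassandra: parse_topology and nodes_by_dc are made-up helper names, and it 
assumes plain <IP>=<DC>:<Rack> lines like the ones above.

```python
from collections import defaultdict

# Sample contents, copied from the cassandra-topology.properties example above.
TOPOLOGY = """\
# Format is <Node IP>=<DC Name>:<Rack Name>
192.168.0.11=DC1_realtime:node_1
192.168.0.12=DC1_realtime:node_2
192.168.0.13=DC1_analytics:node_3
192.168.1.11=DC2_realtime:node_1
"""


def parse_topology(text):
    """Map each node IP to a (dc, rack) tuple, skipping comments and blank lines."""
    nodes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        ip, _, location = line.partition("=")
        dc, _, rack = location.partition(":")
        nodes[ip.strip()] = (dc.strip(), rack.strip())
    return nodes


def nodes_by_dc(nodes):
    """Group node IPs by the DC name they were assigned."""
    grouped = defaultdict(list)
    for ip, (dc, _rack) in sorted(nodes.items()):
        grouped[dc].append(ip)
    return dict(grouped)


topology = parse_topology(TOPOLOGY)
by_dc = nodes_by_dc(topology)
for dc, ips in by_dc.items():
    print(dc, ips)
```

This makes it easy to see at a glance how many nodes each virtual DC really has.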

If you then use these DC names in the replication settings for the keyspace, 
you can control exactly which set of nodes the replicas end up on.

For example, in cassandra-cli:

create keyspace ks1 with placement_strategy = 
'org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options = { 
DC1_realtime: 2, DC1_analytics: 1, DC2_realtime: 1 };
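
One thing worth checking by hand is that the DC names in strategy_options 
match the names in the topology file exactly, and that you aren't asking for 
more replicas in a DC than it has nodes. A rough sketch of that check follows. 
check_strategy_options is a hypothetical helper of mine, and the node counts 
are just tallied from the topology example above.

```python
# Nodes per virtual DC, counted from the topology example above.
NODES_PER_DC = {"DC1_realtime": 2, "DC1_analytics": 1, "DC2_realtime": 1}

# The strategy_options from the create keyspace statement above.
STRATEGY_OPTIONS = {"DC1_realtime": 2, "DC1_analytics": 1, "DC2_realtime": 1}


def check_strategy_options(options, nodes_per_dc):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for dc, replicas in options.items():
        available = nodes_per_dc.get(dc, 0)
        if available == 0:
            problems.append(f"{dc}: no nodes in topology (name mismatch?)")
        elif replicas > available:
            problems.append(
                f"{dc}: {replicas} replicas requested but only {available} node(s)"
            )
    return problems


problems = check_strategy_options(STRATEGY_OPTIONS, NODES_PER_DC)
print(problems or "strategy_options look consistent with the topology")
```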

As far as I know there isn't any way to use the rack name in the 
strategy_options for a keyspace; the options only take a DC name and a replica 
count for it. You might want to look at the code to confirm that.

Whichever snitch you use, the nodes are sorted in order of proximity to the 
client node. How this is determined depends on the snitch, but most of the 
ones that ship with Cassandra use the default ordering of same-node 
< same-rack < same-datacenter < different-datacenter. Each snitch has methods 
to tell Cassandra which rack and DC a node is in, so it always knows which node 
is closest. Combined with the replica placement worked out by the replication 
strategy, this tells Cassandra where the nearest replica is.
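
To illustrate the ordering, here is a simplified sketch. It is not the actual 
snitch code: proximity_rank and sort_by_proximity are names I made up, and it 
only looks at the DC and rack labels, using the same addresses as the topology 
example above.

```python
# Node IP -> (dc, rack), same layout as the topology example above.
TOPOLOGY = {
    "192.168.0.11": ("DC1_realtime", "node_1"),
    "192.168.0.12": ("DC1_realtime", "node_2"),
    "192.168.0.13": ("DC1_analytics", "node_3"),
    "192.168.1.11": ("DC2_realtime", "node_1"),
}


def proximity_rank(local, other, topology):
    """0 = same node, 1 = same rack, 2 = same DC, 3 = different DC."""
    if other == local:
        return 0
    local_dc, local_rack = topology[local]
    other_dc, other_rack = topology[other]
    if local_dc == other_dc:
        # Rack names are only compared within one DC, since two DCs
        # can reuse the same rack name (node_1 appears twice above).
        return 1 if local_rack == other_rack else 2
    return 3


def sort_by_proximity(local, replicas, topology):
    """Order replicas from nearest to farthest; ties keep their input order."""
    return sorted(replicas, key=lambda ip: proximity_rank(local, ip, topology))


replicas = ["192.168.1.11", "192.168.0.13", "192.168.0.12", "192.168.0.11"]
ordered = sort_by_proximity("192.168.0.11", replicas, TOPOLOGY)
print(ordered)
```

Note that with virtual DCs, a node in DC1_analytics counts as "different 
datacenter" from a node in DC1_realtime even if they sit in the same building.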



-----Original Message-----
From: prasenjit mukherjee [mailto:prasen....@gmail.com] 
Sent: 11 July 2012 06:33
To: user
Subject: How to come up with a predefined topology

Quoting from 
http://www.datastax.com/docs/0.8/cluster_architecture/replication#networktopologystrategy
:

"Asymmetrical replication groupings are also possible depending on your use 
case. For example, you may want to have three replicas per data center to serve 
real-time application requests, and then have a single replica in a separate 
data center designated to running analytics."

I have two questions:
1. Is there an example of how to configure a topology with 3 replicas in one 
DC (2 in one rack + 1 in another rack) and one replica in another DC?
 The default NetworkTopologyStrategy with RackInferringSnitch will only give me 
an equal distribution (2+2).

2. I am assuming reads can go to any of the replicas. Is there a client that 
will send a query to the node (in the Cassandra ring) that is closest to the 
client?

-Thanks,
Prasenjit

