> Is the above understanding correct ?

yes, sorry.

Cheers
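
For reference, the arithmetic confirmed above can be sketched in a few lines of Python. This is only an illustration of the int(r/n) and r%n split discussed in this thread, not the NetworkTopologyStrategy code itself; the function name `replicas_per_rack` is made up for the example. The linked code walks the token ring, so which racks actually receive the extra replicas depends on ring order.

```python
def replicas_per_rack(r, n):
    """Split r replicas over n racks as evenly as possible.

    Hypothetical helper for illustration only: every rack gets int(r/n)
    replicas and the first r % n racks get one more.
    """
    if n < 1:
        raise ValueError("need at least one rack")
    if r < 0:
        raise ValueError("replica count cannot be negative")
    base, extra = divmod(r, n)
    # The first `extra` racks carry one additional replica.
    return [base + 1 if i < extra else base for i in range(n)]


if __name__ == "__main__":
    print(replicas_per_rack(8, 3))  # the r = 8, n = 3 case from this thread
    print(replicas_per_rack(2, 3))  # more racks than replicas
```

With r = 8 and n = 3 this gives 3, 3 and 2, matching the sample use case quoted below. With more racks than replicas, the first r racks get one replica each and the rest get none.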
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 13/07/2012, at 4:24 PM, prasenjit mukherjee wrote:

> On Fri, Jul 13, 2012 at 4:04 AM, aaron morton <aa...@thelastpickle.com> wrote:
>> The logic is here
>> https://github.com/apache/cassandra/blob/cassandra-1.1/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java#L78
>
> Thanks Aaron for pointing to the code.
>
>>
>> a. n>r : I am assuming, have 1 replica in each rack.
>>
>> You have 1 replica in the first n racks.
>>
>> b. n<r : ?? I am assuming, try to equally distribute replicas across
>> in each racks.
>>
>> int(n/r) racks will have the same number of replicas. n % r will have more.
>
> Did you mean r%n ( since r>n) ?
>
> Shouldn't the logic be : all racks will have at least int(r/n) and r%n
> will have 1 additional replica ?
>
> Sample use case ( r = 8, n = 3 )
> n1 : 3 ( 2+1 )
> n2: 3 ( 2+1 )
> n3: 2
>
> Is the above understanding correct ?
>
> -Thanks,
> Prasenjit
>
>>
>> This is why multi rack replication can be tricky.
>>
>> Hope that helps.
>>
>>
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 12/07/2012, at 8:05 PM, prasenjit mukherjee wrote:
>>
>> Thanks. Some follow up questions :
>>
>> 1. How do the reads use strategy/snitch information ? I am assuming
>> the reads can go to any of the replicas. WIll it also use the
>> snitch/strategy info to find next 'R' replicas 'closest' to
>> coordinator-node ?
>>
>> 2. In a single DC ( with n racks and r replicas ) what algorithm
>> cassandra uses to write its replicas in following scenarios :
>> a. n>r : I am assuming, have 1 replica in each rack.
>> b. n<r : ?? I am assuming, try to equally distribute replicas across
>> in each racks.
>>
>> -Thanks,
>> Prasenjit
>>
>> On Thu, Jul 12, 2012 at 11:24 AM, Tyler Hobbs <ty...@datastax.com> wrote:
>> >> I highly recommend specifying the same rack for all nodes (using
>> >> cassandra-topology.properties) unless you really have a good reason not too
>> >> (and you probably don't). The way that replicas are chosen when multiple
>> >> racks are in play can be fairly confusing and lead to a data imbalance if
>> >> you don't catch it.
>> >>
>> >> On Wed, Jul 11, 2012 at 10:53 PM, prasenjit mukherjee <prasen....@gmail.com>
>> >> wrote:
>> >>
>> >> As far as I know there isn't any way to use the rack name in the
>> >> strategy_options for a keyspace. You might want to look at the code to dig
>> >> into that, perhaps.
>> >>
>> >> Aha, I was wondering if I could do that as well ( specify rack options )
>> >> :)
>> >>
>> >> Thanks for the pointer, I will dig into the code.
>> >>
>> >> -Thanks,
>> >> Prasenjit
>> >>
>> >> On Thu, Jul 12, 2012 at 5:33 AM, Richard Lowe <richard.l...@arkivum.com>
>> >> wrote:
>> >> If you then specify the parameters for the keyspace to use these, you
>> >> can control exactly which set of nodes replicas end up on.
>> >>
>> >> For example, in cassandra-cli:
>> >>
>> >> create keyspace ks1 with placement_strategy =
>> >> 'org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options
>> >> = { DC1_realtime: 2, DC1_analytics: 1, DC2_realtime: 1 };
>> >>
>> >> As far as I know there isn't any way to use the rack name in the
>> >> strategy_options for a keyspace. You might want to look at the code to dig
>> >> into that, perhaps.
>> >>
>> >> Whichever snitch you use, the nodes are sorted in order of proximity to
>> >> the client node. How this is determined depends on the snitch that's used
>> >> but most (the ones that ship with Cassandra) will use the default ordering
>> >> of same-node < same-rack < same-datacenter < different-datacenter. Each
>> >> snitch has methods to tell Cassandra which rack and DC a node is in, so it
>> >> always knows which node is closest. Used with the Bloom filters this can
>> >> tell us where the nearest replica is.
>> >>
>> >>
>> >> -----Original Message-----
>> >> From: prasenjit mukherjee [mailto:prasen....@gmail.com]
>> >> Sent: 11 July 2012 06:33
>> >> To: user
>> >> Subject: How to come up with a predefined topology
>> >>
>> >> Quoting from
>> >> http://www.datastax.com/docs/0.8/cluster_architecture/replication#networktopologystrategy
>> >> :
>> >>
>> >> "Asymmetrical replication groupings are also possible depending on your
>> >> use case. For example, you may want to have three replicas per data center
>> >> to serve real-time application requests, and then have a single replica in a
>> >> separate data center designated to running analytics."
>> >>
>> >> Have 2 questions :
>> >> 1. Any example how to configure a topology with 3 replicas in one DC (
>> >> with 2 in 1 rack + 1 in another rack ) and one replica in another DC ?
>> >> The default networktopologystrategy with rackinferringsnitch will only
>> >> give me equal distribution ( 2+2 )
>> >>
>> >> 2. I am assuming the reads can go to any of the replicas. Is there a
>> >> client which will send query to a node ( in cassandra ring ) which is
>> >> closest to the client ?
>> >>
>> >> -Thanks,
>> >> Prasenjit
>> >>
>> >>
>> >> --
>> >> Tyler Hobbs
>> >> DataStax
>> >>
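
The default proximity ordering Richard describes in the quoted message (same-node < same-rack < same-datacenter < different-datacenter) can be sketched roughly as follows. This is a toy model in Python, not Cassandra's snitch API: `Node`, `proximity_rank` and `sort_by_proximity` are hypothetical names invented for the example, and the DC and rack labels are made up.

```python
from collections import namedtuple

# Hypothetical stand-in for what a snitch knows about a node.
Node = namedtuple("Node", ["name", "dc", "rack"])


def proximity_rank(origin, other):
    """Lower rank means closer to `origin`."""
    if other.name == origin.name:
        return 0  # same node
    if other.dc == origin.dc and other.rack == origin.rack:
        return 1  # same rack
    if other.dc == origin.dc:
        return 2  # same datacenter, different rack
    return 3  # different datacenter


def sort_by_proximity(origin, replicas):
    """Return replicas ordered from closest to farthest (stable sort)."""
    return sorted(replicas, key=lambda node: proximity_rank(origin, node))


if __name__ == "__main__":
    origin = Node("a", "DC1", "r1")
    replicas = [
        Node("d", "DC2", "r1"),
        Node("c", "DC1", "r2"),
        Node("b", "DC1", "r1"),
        Node("a", "DC1", "r1"),
    ]
    print([node.name for node in sort_by_proximity(origin, replicas)])
```

Because Python's sort is stable, replicas with the same rank keep their input order, so any tie-breaking would have to be added separately.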