Re: How to come up with a predefined topology

aaron morton Mon, 16 Jul 2012 03:06:13 -0700

> Is the above understanding correct ?
yes, sorry.
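
For reference, the arithmetic being confirmed (the r = 8, n = 3 case quoted below) can be sketched in Python. This is only the counting rule as stated in the thread, not the NetworkTopologyStrategy code itself: the function name `replicas_per_rack` is hypothetical, and giving the extra replicas to the "first" racks is an assumption made for illustration. The linked source remains the authority on where replicas actually land.

```python
def replicas_per_rack(replicas: int, racks: int) -> list[int]:
    """Split `replicas` across `racks` as evenly as possible.

    Illustrates the rule from the thread: every rack gets int(r/n)
    replicas and r % n racks get one more. Which racks receive the
    extra replica is an assumption here (the first ones in the list).
    """
    if racks <= 0:
        raise ValueError("need at least one rack")
    base, extra = divmod(replicas, racks)
    # The first `extra` racks each take one additional replica.
    return [base + 1 if i < extra else base for i in range(racks)]


print(replicas_per_rack(8, 3))  # [3, 3, 2]
print(replicas_per_rack(2, 3))  # [1, 1, 0]
```

The second call covers the n>r case: with fewer replicas than racks, some racks hold no replica at all.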

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 13/07/2012, at 4:24 PM, prasenjit mukherjee wrote:

> On Fri, Jul 13, 2012 at 4:04 AM, aaron morton <aa...@thelastpickle.com> wrote:
>> The logic is here
>> https://github.com/apache/cassandra/blob/cassandra-1.1/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java#L78
> 
> Thanks Aaron for pointing to the code.
> 
>> 
>> a. n>r : I am assuming, have 1 replica in each rack.
>> 
>> You have 1 replica in the first r racks.
>> 
>> b. n<r : ?? I am assuming, try to equally distribute replicas across
>> the racks.
>> 
>> int(n/r) racks will have the same number of replicas. n % r will have more.
> 
> Did you mean  r%n ( since r>n)  ?
> 
> Shouldn't the logic be : all racks will have at least int(r/n) replicas,
> and r%n racks will have 1 additional replica ?
> 
> Sample use case ( r = 8, n = 3 )
> n1 : 3 ( 2+1 )
> n2 : 3 ( 2+1 )
> n3 : 2
> 
> Is the above understanding correct ?
> 
> -Thanks,
> Prasenjit
> 
>> 
>> This is why multi rack replication can be tricky.
>> 
>> Hope that helps.
>> 
>> 
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 12/07/2012, at 8:05 PM, prasenjit mukherjee wrote:
>> 
>> Thanks. Some follow up questions :
>> 
>> 1.  How do the reads use strategy/snitch information ? I am assuming
>> the reads can go to any of the replicas. Will it also use the
>> snitch/strategy info to find next 'R' replicas 'closest' to
>> coordinator-node ?
>> 
>> 2. In a single DC ( with n racks and r replicas ) what algorithm
>> does Cassandra use to place its replicas in the following scenarios :
>> a. n>r : I am assuming, have 1 replica in each rack.
>> b. n<r : ?? I am assuming, try to equally distribute replicas across
>> the racks.
>> 
>> -Thanks,
>> Prasenjit
>> 
>> On Thu, Jul 12, 2012 at 11:24 AM, Tyler Hobbs <ty...@datastax.com> wrote:
>> 
>> I highly recommend specifying the same rack for all nodes (using
>> cassandra-topology.properties) unless you really have a good reason not to
>> (and you probably don't).  The way that replicas are chosen when multiple
>> racks are in play can be fairly confusing and lead to a data imbalance if
>> you don't catch it.
>> 
>> On Wed, Jul 11, 2012 at 10:53 PM, prasenjit mukherjee <prasen....@gmail.com> wrote:
>> 
>> 
>> As far as I know there isn't any way to use the rack name in the
>> strategy_options for a keyspace. You might want to look at the code to dig
>> into that, perhaps.
>> 
>> Aha, I was wondering if I could do that as well ( specify rack options ) :)
>> 
>> Thanks for the pointer, I will dig into the code.
>> 
>> -Thanks,
>> Prasenjit
>> 
>> On Thu, Jul 12, 2012 at 5:33 AM, Richard Lowe <richard.l...@arkivum.com> wrote:
>> 
>> If you then specify the parameters for the keyspace to use these, you
>> can control exactly which set of nodes replicas end up on.
>> 
>> For example, in cassandra-cli:
>> 
>> create keyspace ks1 with placement_strategy =
>> 'org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options
>> = { DC1_realtime: 2, DC1_analytics: 1, DC2_realtime: 1 };
>> 
>> As far as I know there isn't any way to use the rack name in the
>> strategy_options for a keyspace. You might want to look at the code to dig
>> into that, perhaps.
>> 
>> Whichever snitch you use, the nodes are sorted in order of proximity to
>> the client node. How this is determined depends on the snitch that's used
>> but most (the ones that ship with Cassandra) will use the default ordering
>> of same-node < same-rack < same-datacenter < different-datacenter. Each
>> snitch has methods to tell Cassandra which rack and DC a node is in, so it
>> always knows which node is closest. Used with the Bloom filters this can
>> tell us where the nearest replica is.
>> 
>> 
>> 
>> 
>> -----Original Message-----
>> From: prasenjit mukherjee [mailto:prasen....@gmail.com]
>> Sent: 11 July 2012 06:33
>> To: user
>> Subject: How to come up with a predefined topology
>> 
>> Quoting from
>> http://www.datastax.com/docs/0.8/cluster_architecture/replication#networktopologystrategy
>> :
>> 
>> "Asymmetrical replication groupings are also possible depending on your
>> use case. For example, you may want to have three replicas per data center
>> to serve real-time application requests, and then have a single replica in a
>> separate data center designated to running analytics."
>> 
>> Have 2 questions :
>> 
>> 1. Any example how to configure a topology with 3 replicas in one DC (
>> with 2 in 1 rack + 1 in another rack ) and one replica in another DC ?
>> The default NetworkTopologyStrategy with RackInferringSnitch will only
>> give me equal distribution ( 2+2 )
>> 
>> 2. I am assuming the reads can go to any of the replicas. Is there a
>> client which will send query to a node ( in cassandra ring ) which is
>> closest to the client ?
>> 
>> -Thanks,
>> Prasenjit
>> 
>> --
>> Tyler Hobbs
>> DataStax
>> 
>>
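
The proximity ordering Richard Lowe describes in the quoted message (same-node < same-rack < same-datacenter < different-datacenter) can be illustrated with a small Python sketch. This is not Cassandra's snitch API: the `Node` class and the `proximity` and `sort_by_proximity` functions are hypothetical names invented for the example, and the node names, DCs and racks are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    dc: str
    rack: str


def proximity(origin: Node, other: Node) -> int:
    """Rank `other` relative to `origin`; a lower rank means closer."""
    if other == origin:
        return 0  # same node
    if other.dc == origin.dc and other.rack == origin.rack:
        return 1  # same rack (rack names only compare within one DC)
    if other.dc == origin.dc:
        return 2  # same datacenter, different rack
    return 3      # different datacenter


def sort_by_proximity(origin: Node, nodes: list[Node]) -> list[Node]:
    # sorted() is stable, so nodes with equal rank keep their input order.
    return sorted(nodes, key=lambda n: proximity(origin, n))


origin = Node("a", "DC1", "r1")
replicas = [
    Node("d", "DC2", "r1"),
    Node("c", "DC1", "r2"),
    Node("b", "DC1", "r1"),
    origin,
]
print([n.name for n in sort_by_proximity(origin, replicas)])
# ['a', 'b', 'c', 'd']
```

Node "d" shares the rack name "r1" with the origin but sits in another DC, so it still sorts last.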
