OK, now I am confused :). If I have the following: placement_strategy = 'NetworkTopologyStrategy' and strategy_options = {DC1:1, DC2:1, DC3:1},
does this mean that in each of my datacenters I will have one full replica, and that replica can also be a seed node? If I have 3 nodes in addition to the DC replicas, then with normal token calculations a key can be in any datacenter plus on each of the replicas, right? It will show 12 nodes total in its ring.

On Thu, May 24, 2012 at 2:39 AM, aaron morton <aa...@thelastpickle.com> wrote:

> This is partly historical. NTS (as it is now) has not always existed and was not always the default. In days gone by, it used to be a fella could run a mighty fine key-value store using just the Simple Replication Strategy.
>
> A different way to visualise it is a single ring with a Z axis for the DCs. When you look at the ring from the top you can see all the nodes. When you look at it from the side you can see the nodes are on levels that correspond to their DC. SimpleStrategy looks at the ring from the top; NTS works through the layers of the ring.
>
>> If the hierarchy is Cluster -> DataCenter -> Node, why exactly do we need globally unique node tokens even though nodes are at the lowest level in the hierarchy?
>
> Nodes having a DC is a feature of *some* snitches and is utilised by *some* of the replication strategies (and by the messaging system for network efficiency). For background, the mapping from row tokens to nodes is based on http://en.wikipedia.org/wiki/Consistent_hashing
>
> Hope that helps.
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 24/05/2012, at 1:07 AM, java jalwa wrote:
>
>> Thanks Aaron. That makes things clear.
>> So I guess the 0 - 2^127 range for tokens corresponds to a cluster-level, top-level ring, and then NTS adds some logic on top of that to logically segment that range into sub-rings as per the datacenters defined in its options. What's the advantage of having a single top-level ring? Intuitively it seems like each replication group could have a separate ring, so that the same tokens could be assigned to nodes in different DCs. If the hierarchy is Cluster -> DataCenter -> Node, why exactly do we need globally unique node tokens even though nodes are at the lowest level in the hierarchy?
>>
>> Thanks again.
>>
>> On Wed, May 23, 2012 at 3:14 AM, aaron morton <aa...@thelastpickle.com> wrote:
>>
>>>> Now if a row key hash is mapped to a range owned by a node in DC3, will the node in DC3 still store the key as determined by the partitioner and then walk the ring and store 2 replicas each in DC1 and DC2?
>>>
>>> No, only nodes in the DCs specified in the NTS configuration will be replicas.
>>>
>>>> Or will the co-ordinator node be aware of the replica placement strategy, and override the partitioner's decision and walk the ring until it first encounters a node in DC1 or DC2, and then place the remaining replicas?
>>>
>>> NTS considers each DC to have its own ring. This can make token selection in a multi-DC environment confusing at times. There is something in the DataStax docs about it.
>>>
>>> Cheers
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 23/05/2012, at 3:16 PM, java jalwa wrote:
>>>
>>>> Hi all,
>>>> I am a bit confused regarding the terms "replica" and "replication factor". Assume that I am using RandomPartitioner and NetworkTopologyStrategy for replica placement.
>>>> From what I understand, with RandomPartitioner a row key will always be hashed and stored on the node that owns the range to which the key is mapped.
>>>> http://www.datastax.com/docs/1.0/cluster_architecture/replication#networktopologystrategy
>>>> The example here talks about having 2 datacenters and a replication factor of 4, with 2 replicas in each datacenter, so the strategy is configured as DC1:2 and DC2:2. Now suppose I add another datacenter, DC3, and do not change the NetworkTopologyStrategy options.
>>>> Now if a row key hash is mapped to a range owned by a node in DC3, will the node in DC3 still store the key as determined by the partitioner and then walk the ring and store 2 replicas each in DC1 and DC2? Will that mean that I will then have 5 replicas in the cluster and not 4? Or will the co-ordinator node be aware of the replica placement strategy, override the partitioner's decision, and walk the ring until it first encounters a node in DC1 or DC2, and then place the remaining replicas?
>>>>
>>>> Thanks.
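
[Editor's note: to make the consistent hashing that Aaron links to concrete, here is a minimal sketch, not Cassandra's actual code, of how a RandomPartitioner-style ring maps a row key to the node that owns its token range. The 4-node ring, the node names, and the helper functions are invented for illustration; the real partitioner likewise derives the token from the 128-bit MD5 of the key and assigns it to the first node token at or after it, wrapping around the ring.]

    import bisect
    import hashlib

    RING_SIZE = 2 ** 127   # RandomPartitioner tokens live in 0 .. 2^127

    def key_to_token(row_key):
        # Token is derived from an MD5 hash of the key (simplified: the real
        # partitioner uses the 128-bit MD5 as a non-negative BigInteger).
        digest = hashlib.md5(row_key.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % RING_SIZE

    def primary_node(key_token, ring):
        # ring: sorted list of (node_token, node_name). The key belongs to the
        # first node whose token is >= the key's token, wrapping around to the
        # lowest-token node once we run off the end of the ring.
        tokens = [t for t, _ in ring]
        idx = bisect.bisect_left(tokens, key_token) % len(ring)
        return ring[idx][1]

    # Hypothetical 4-node ring with evenly spaced tokens.
    ring = [(i * RING_SIZE // 4, "node%d" % (i + 1)) for i in range(4)]
    token = key_to_token("some_row_key")
    print(token, "->", primary_node(token, ring))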
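[Editor's note: and here is an equally rough sketch of the difference Aaron describes between SimpleStrategy (look at the ring from the top) and NetworkTopologyStrategy (walk each DC's layer separately). The ring, the node-to-DC mapping, and the function names are assumptions made up for illustration, and real NTS also tries to spread replicas across racks, which is omitted here. With options {DC1:1, DC2:1, DC3:1} it places exactly one replica in each named DC, which is the behaviour the question at the top of the thread asks about; a DC not listed in the options never receives a replica.]

    import bisect

    RING_SIZE = 2 ** 127
    # Hypothetical ring: 6 nodes, 2 per DC, with interleaved tokens.
    ring = sorted((i * RING_SIZE // 6, "DC%d-node%d" % (i % 3 + 1, i // 3 + 1))
                  for i in range(6))
    node_dc = {name: name.split("-")[0] for _, name in ring}

    def ring_walk(key_token):
        # Yield nodes clockwise, starting from the node that owns key_token.
        tokens = [t for t, _ in ring]
        start = bisect.bisect_left(tokens, key_token) % len(ring)
        for i in range(len(ring)):
            yield ring[(start + i) % len(ring)][1]

    def simple_strategy(key_token, rf):
        # SimpleStrategy: take the first rf nodes, ignoring datacenters.
        return list(ring_walk(key_token))[:rf]

    def network_topology_strategy(key_token, rf_per_dc):
        # Simplified NTS: each DC is treated as its own ring ("layer"). For
        # every DC named in the options, walk the full ring from the key's
        # token and keep only that DC's nodes until its count is met. DCs
        # that are not listed never get a replica. Rack awareness omitted.
        replicas = []
        for dc, dc_rf in rf_per_dc.items():
            picked = 0
            for node in ring_walk(key_token):
                if node_dc[node] == dc:
                    replicas.append(node)
                    picked += 1
                    if picked == dc_rf:
                        break
        return replicas

    some_token = RING_SIZE // 5
    print(simple_strategy(some_token, 3))
    print(network_topology_strategy(some_token, {"DC1": 1, "DC2": 1, "DC3": 1}))
    # -> one replica in each of DC1, DC2 and DC3 (RF 3 total), no matter which
    #    DC happens to own the key's primary token range.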