That appears to be working correctly, but does not sound great. 

When NTS (NetworkTopologyStrategy) selects replicas in a DC, it orders the
tokens available in that DC, then (in the first pass) iterates through them,
placing a replica in each unique rack. For example, if the RF in each DC were
2, the replicas would be put on 2 unique racks if possible. So the node with
the lowest token in the DC will *always* get a write.
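
To make that concrete, here is a rough Python sketch of the selection as I
described it. It is a simplified model, not the actual Cassandra code: the
function name, the node names, and the assignment of tokens to nodes are all
made up for illustration.

```python
# Simplified model of the per-DC selection described above; not Cassandra's code.
def select_replicas(dc_nodes, rf):
    """dc_nodes: list of (token, rack, node_name) tuples for a single DC."""
    ordered = sorted(dc_nodes)  # order the DC's nodes by token
    replicas = []
    racks_used = set()
    # First pass: at most one replica per unique rack.
    for _token, rack, node in ordered:
        if len(replicas) < rf and rack not in racks_used:
            replicas.append(node)
            racks_used.add(rack)
    # Second pass: fill any remaining slots, ignoring racks.
    for _token, _rack, node in ordered:
        if len(replicas) < rf and node not in replicas:
            replicas.append(node)
    return replicas


# Hypothetical layout: which token belongs to which node is an assumption.
NY1 = [
    (0, "RAC1", "ny1-rac1"),
    (2**126, "RAC2", "ny1-rac2"),
]

print(select_replicas(NY1, rf=1))  # only the lowest-token node
print(select_replicas(NY1, rf=2))  # one node per rack
```

Note that the row key never enters into it in this model, which is why the
same node is picked every time with RF 1.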

It's not possible to load balance between the racks, as no state is shared
between requests. A possible alternative would be to find the nearest token to
the key and start allocating replicas from there. But as each DC contains only
a part (say half) of the token range, the likelihood is that half of the keys
would fall outside the DC's range and map to one end of it, so it would not be
a great solution.
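
A quick back-of-the-envelope check of that alternative, again only a sketch
under assumptions: I take "nearest token" to mean the first DC token at or
after the key (wrapping around), a ring of size 2**127 as with
RandomPartitioner, and evenly spread keys. The function name and the token
layouts are hypothetical.

```python
from fractions import Fraction

RING = 2**127  # assumed ring size (RandomPartitioner token space)


def share_by_nearest_token(dc_tokens):
    """Fraction of all keys whose first replica lands on each DC token,
    if selection started at the first DC token at or after the key."""
    ordered = sorted(dc_tokens)
    shares = {}
    for i, tok in enumerate(ordered):
        prev = ordered[i - 1]  # wraps to the last token when i == 0
        width = (tok - prev) % RING
        if width == 0:  # a single token covers the whole ring
            width = RING
        shares[tok] = Fraction(width, RING)
    return shares


# Hypothetical case: the DC holds two adjacent tokens (half the ring).
adjacent = share_by_nearest_token([0, 2**125])
# Hypothetical case: the DC's tokens alternate with the other DC's.
interleaved = share_by_nearest_token([0, 2**126])

print(adjacent)
print(interleaved)
```

In the adjacent case every key outside the DC's own half wraps around to the
same node, so it still ends up with most of the data. Only the interleaved
layout comes out even.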

I think what you are trying to achieve is not possible. Do you have the
capacity to run RF 2 in each DC? That would at least even things out.

Aaron
 

On 3 May 2011, at 06:40, Eric tamme wrote:

> I am experiencing an issue where replication is not being distributed
> between racks when using PropertyFileSnitch in conjunction with
> NetworkTopologyStrategy.
> 
> I am running 0.7.3 from a tar.gz on  cassandra.apache.org
> 
> I have 4 nodes, 2 data centers, and 2 racks in each data center.  Each
> rack has 1 node.
> 
> I have even token distribution so that each node gets 25%:
> 
> 0
> 42535295865117307932921825928971026432
> 85070591730234615865843651857942052864
> 127605887595351923798765477786913079296
> 
> My cassandra-topology.properties is as follows:
> 
> # Cassandra Node IP=Data Center:Rack
> ffff\:0\:ffff\:eeee\:\:fffe=NY1:RAC1
> ffff\:0\:ffff\:eeee\:\:ffff=NY1:RAC2
> 
> ffff\:0\:ffff\:ffff\:\:fffe=LA1:RAC1
> ffff\:0\:ffff\:ffff\:\:ffff=LA1:RAC2
> 
> # default for unknown nodes
> default=NY1:RAC1
> 
> 
> My Keyspace replication strategy is as follows:
> Keyspace: SipTrace:
>  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
>    Options: [LA1:1,NY1:1]
> 
> So each data center should get 1 copy of the data, and this does
> happen.  The problem is that the replicated copies get pinned to the
> first host configured in the properties file, from what I can discern,
> and DO NOT distribute between racks.  So I have 2 nodes that have a 4
> to 1 ratio of data compared to the other 2 nodes.  This is a problem!
> 
> Can anyone please tell me if I have misconfigured this?  Or how I can
> get replica data to distribute evenly between racks within a
> datacenter?  I was led to believe that cassandra will try to
> distribute between racks for replica data automatically under this
> setup.
> 
> Thank you for your help in advance!
> 
> -Eric
