The way that I understand it (and that seems to be consistent with what was said in this discussion) is that each DC has its own data space. Using your simplified 0-9 system:

     DC1    DC2
0    D1R1   D2R2
1    D1R1   D2R1
2    D1R1   D2R1
3    D1R1   D2R1
4    D1R1   D2R1
5    D1R2   D2R1
6    D1R2   D2R2
7    D1R2   D2R2
8    D1R2   D2R2
9    D1R2   D2R2
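The per-DC assignment listed above can be reproduced with a short sketch. It assumes tokens 0 and 5 in DC1 and 1 and 6 in DC2 (the scaled-down equivalent of the tokens quoted below), and it follows the listing's convention that a node owns the keys from its own token up to the next token in the same DC, which may differ from how Cassandra itself draws range boundaries. The names `TOKENS` and `owner` are made up for illustration.

```python
from bisect import bisect_right

RING_SIZE = 10  # simplified token space 0-9

# Assumed token assignment: each DC is balanced on its own ring,
# and DC2 is offset by 1 so that no two nodes share a token.
TOKENS = {
    "DC1": {0: "D1R1", 5: "D1R2"},
    "DC2": {1: "D2R1", 6: "D2R2"},
}

def owner(dc, key_token):
    """Return the node in `dc` with the largest token <= key_token.

    Keys below the DC's lowest token wrap around to the node with
    the highest token (index -1 in the sorted list).
    """
    tokens = sorted(TOKENS[dc])
    idx = bisect_right(tokens, key_token) - 1
    return TOKENS[dc][tokens[idx]]

# Print one row per token: token, DC1 owner, DC2 owner.
for t in range(RING_SIZE):
    print(t, owner("DC1", t), owner("DC2", t))
```

Only the nodes of one DC are considered when picking that DC's owner, which is why the 1-token offset between DCs does not leave any node with a 1-token range.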
Each node is responsible for half of the ring in its own DC.

----- Original Message -----
From: "Eric tamme" <eta...@gmail.com>
To: user@cassandra.apache.org
Sent: Wednesday, May 4, 2011 1:58:19 PM
Subject: Re: Replica data distributing between racks

> Jonathan is suggesting the approach Jeremiah was using.
>
> Calculate the tokens for the nodes in each DC independently, and then add
> 1 to the tokens if there are two nodes with the same tokens.
>
> In your case with 2 DCs with 2 nodes each:
>
> In DC 1
> node 1 = 0
> node 2 = 85070591730234615865843651857942052864
>
> In DC 2
> node 1 = 1
> node 2 = 85070591730234615865843651857942052865
>
> This will evenly distribute the keys in each DC, which is what the
> NetworkTopologyStrategy is trying to do.

Okay - I appreciate the direct solution, but I am still really confused. I think I am missing something conceptual here... it just isn't "clicking".

If I have 4 nodes in two data centers, each in its own rack:

DC1R1, DC1R2, DC2R1, DC2R2

Tokens:
DC1R1: N
DC1R2: M
DC2R1: N+1
DC2R2: M+1

Who is responsible for what in primary distribution and in replication? Is DC1R2 responsible for M to M+1 (aka 1 token, M)??? That doesn't make any sense... or am I supposed to be making primary distribution uneven so that the uneven replication then balances it?

I am trying to conceptualize this... I drew up a graph of the range responsibility for this token assignment, based on a simplified token range of 0-9:

http://dl.dropbox.com/u/19254184/tokens.jpg

I must be missing something, I just don't know what. If someone can please explain, or point me to resources that clearly explain this, I would appreciate it.

Thanks for everyone's time
-Eric
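The token values quoted in this thread can be reproduced with a minimal sketch. It assumes a token space of 2**127 (half of which equals the quoted node 2 value), the same number of nodes in every DC, and an offset equal to the DC's index; `dc_tokens` is a hypothetical helper, not part of Cassandra.

```python
TOKEN_SPACE = 2 ** 127  # assumed size of the token space

def dc_tokens(nodes_per_dc, dc_index):
    """Evenly spaced tokens for one DC, shifted by dc_index.

    The shift keeps tokens unique across DCs when every DC would
    otherwise compute the same evenly spaced values.
    """
    return [i * TOKEN_SPACE // nodes_per_dc + dc_index
            for i in range(nodes_per_dc)]

# Two DCs with two nodes each, as in the quoted example.
for dc in range(2):
    print("DC", dc + 1, dc_tokens(2, dc))
```

Each DC is balanced independently; the offset only prevents token collisions and does not change how much of its own DC's ring each node covers.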