You've enumerated the options and tradeoffs correctly. I've personally seen both implemented, and they're both fine.
With option 1, there's also a variant where you don't do "primary range" based repairs at all, but rather let a scheduler walk the token ring and use any replica in any DC as the coordinator (basically randomizing it). Doing that removes the extra bias toward the first DC in repair coordination (a rough sketch of what I mean is at the bottom of this mail). In a future where Cassandra coordinates repair itself (see the recent CEP proposals), this should be something that happens automatically, likely within the next year or so.

On 2024/11/20 23:44:18 Long Pan wrote:
> Dear Cassandra Community,
>
> I'm currently exploring the use of *single token per node* in large-scale
> Cassandra deployments and have questions regarding *token assignment
> strategies* in the context of multiple datacenters using
> NetworkTopologyStrategy (RF=3 per DC).
>
> For horizontal scaling, I'm planning to adopt a *"100% expansion and 50%
> shrinkage" strategy*, as it avoids token movement and simplifies operations.
>
> *Approach 1:* Small Offset Between Adjacent Tokens
> [image: image.png]
>
> An intuitive approach is to set a *small offset* between adjacent tokens
> from different datacenters (e.g., dc1 in green and dc2 in red). This
> minimizes disruption during scaling.
>
> However, my concern is about *primary range repair*. In this setup,
> *green-to-red token ranges* are much smaller than *red-to-green token
> ranges*, meaning *dc1 (green) nodes* will consistently bear higher repair
> coordination workloads than *dc2 (red) nodes*. Since we run primary range
> repair regularly, this imbalance could lead to unbalanced hardware
> resource usage between the datacenters.
>
> *Approach 2:* Balanced Token Ranges
> [image: image.png]
>
> An alternative is to balance every token range to ensure fairness in repair
> workloads. However, during cluster expansion/shrinkage, some existing
> tokens will need to move. For example, in the diagram, while green tokens
> remain fixed during expansion, *all red tokens must move*. This could
> result in *mass token movement* for large clusters and the need for
> cleanups, which seems operationally heavy and complex.
>
> *Question to the Community:*
>
> 1. Is my concern about the repair workload imbalance in Approach 1
> valid, or are there mitigating factors I'm overlooking?
> 2. If you've faced similar challenges, what token assignment strategies
> have worked well in multi-datacenter setups with single-token nodes?
>
> Looking forward to hearing your insights and experiences!
>
> Best regards,
> Long
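
To make the "let a scheduler walk the ring and pick any replica in any DC" idea a bit more concrete, here's a rough sketch of what such an external scheduler could look like. It uses the DataStax Python driver for token/replica metadata and plain nodetool subrange repairs; the contact point, keyspace name, and ssh-based execution are placeholders you'd swap for your own tooling, and it's an illustration of the idea rather than a finished tool (tools like Cassandra Reaper already do segment-based repair along these lines).

#!/usr/bin/env python3
"""Rough sketch (not production code): walk the ring one natural token
range at a time and let a randomly chosen replica -- from any DC --
coordinate a full subrange repair, instead of always using the
primary-range owner."""

import random
import subprocess

from cassandra.cluster import Cluster   # DataStax Python driver

CONTACT_POINT = "10.0.1.1"   # placeholder: any reachable node
KEYSPACE = "my_keyspace"     # placeholder keyspace name

MIN_TOKEN = -(2 ** 63)       # Murmur3Partitioner token space
MAX_TOKEN = 2 ** 63 - 1


def repair_range(node_address, start, end):
    """Run `nodetool repair -full -st <start> -et <end>` on the chosen
    coordinator; ssh is used here purely for illustration."""
    subprocess.run(
        ["ssh", node_address,
         "nodetool", "repair", "-full",
         "-st", str(start), "-et", str(end),
         KEYSPACE],
        check=True,
    )


def main():
    cluster = Cluster([CONTACT_POINT])
    cluster.connect()                      # populates token metadata
    token_map = cluster.metadata.token_map
    ring = token_map.ring                  # sorted tokens on the ring

    for i, end_token in enumerate(ring):
        start_token = ring[i - 1]          # i == 0 gives the wrap-around range
        # Every replica of this range, in every DC, is a candidate
        # coordinator, so coordination load spreads across DCs over time.
        replicas = list(token_map.get_replicas(KEYSPACE, end_token))
        coordinator = random.choice(replicas)
        if i == 0:
            # Split the wrapping range so nodetool gets non-wrapping bounds.
            repair_range(coordinator.address, start_token.value, MAX_TOKEN)
            repair_range(coordinator.address, MIN_TOKEN, end_token.value)
        else:
            repair_range(coordinator.address, start_token.value, end_token.value)

    cluster.shutdown()


if __name__ == "__main__":
    main()

Randomizing the coordinator per range is what evens out repair-coordination load between DCs; you could also round-robin across DCs per range if you want it strictly deterministic.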