You don’t have to double. 

You can add one node at a time; you just have to move all of the other tokens to stay balanced.

Most people don’t write the tooling to do that, but it’s not that complicated:

Calculate the token positions with N nodes
Calculate the token positions with N+1 nodes 

Bootstrap the new machine at whichever N+1 token is furthest from any existing token
For each existing node:
    Run cleanup (nodetool cleanup)
    Move the node to its new token (nodetool move)

Run cleanup again 
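
A rough sketch of the token math in the steps above, in Python. It assumes Murmur3Partitioner (token range -2^63 to 2^63-1) and evenly spaced tokens; the function names are made up for illustration, and pairing existing nodes to new tokens in ring order is my assumption, not something the procedure dictates. It only prints a plan and does not talk to a cluster.

```python
# Sketch only: assumes Murmur3Partitioner, whose tokens span [-2**63, 2**63 - 1].
RING_MIN = -2**63
RING_SIZE = 2**64


def balanced_tokens(n):
    """Evenly spaced tokens for an n-node, single-token-per-node ring."""
    return [RING_MIN + (i * RING_SIZE) // n for i in range(n)]


def ring_distance(a, b):
    """Shortest distance between two tokens, allowing for wraparound."""
    d = abs(a - b)
    return min(d, RING_SIZE - d)


def bootstrap_token(old_tokens, new_tokens):
    """The N+1 token whose nearest existing token is furthest away."""
    return max(
        new_tokens,
        key=lambda t: min(ring_distance(t, o) for o in old_tokens),
    )


def plan_expansion(n):
    """Return (token for the new node, [(current_token, target_token), ...]).

    Assumption: existing nodes are matched to the remaining N+1 tokens in
    ring order, so no node has to jump past a neighbour.
    """
    old = balanced_tokens(n)
    new = balanced_tokens(n + 1)
    boot = bootstrap_token(old, new)
    remaining = sorted(t for t in new if t != boot)
    return boot, list(zip(sorted(old), remaining))


if __name__ == "__main__":
    boot, moves = plan_expansion(3)
    print("bootstrap new node at", boot)
    for current, target in moves:
        if current != target:
            print("move", current, "->", target)
```

Each (current, target) pair where the two differ is one cleanup-then-move step from the loop above.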

It’s involved but straightforward, online, and safe.

Because there’s only one token per node, you can bootstrap/move in batches, with the machines in a batch offset by 2x RF around the ring. So if you have 100 machines and RF=3, you can have 16 machines (100 / 6, rounded down) bootstrapping or moving at the same time. You can’t do that safely with vnodes.
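
To make the batch arithmetic concrete, here is a small Python sketch. It reads "offset by 2x RF" as "nodes in the same batch sit at least 2 x RF positions apart in ring order", which is my interpretation; the function names are hypothetical.

```python
def max_concurrent(num_nodes, rf):
    """How many nodes can bootstrap/move at once with a 2 x RF spacing."""
    return max(1, num_nodes // (2 * rf))


def batch(num_nodes, rf, start=0):
    """Ring positions (0-based, in token order) for one batch.

    Positions are 2 x RF apart; rounding the count down keeps the gap
    between the last and first position (across the wraparound) at least
    2 x RF as well.
    """
    stride = 2 * rf
    count = max_concurrent(num_nodes, rf)
    return [(start + j * stride) % num_nodes for j in range(count)]


if __name__ == "__main__":
    print(max_concurrent(100, 3))
    print(batch(100, 3))
```

With 100 machines and RF=3 the first batch covers ring positions 0, 6, 12 and so on up to 90; shifting `start` gives the next batch.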


On Oct 9, 2024, at 12:51 AM, guo Maxwell <cclive1...@gmail.com> wrote:


I think cost is a very important consideration if you are going to use a single token and your cluster will be very large, because every time the cluster is expanded, the number of nodes needs to be doubled: 100 -> 200, 200 -> 400, ...
This is one of the reasons why we maintain many small clusters.

Of course, its availability will be better.

On Wed, Oct 9, 2024 at 11:56, Abe Ratnofsky <a...@aber.io> wrote:

On Oct 7, 2024, at 17:30, Long Pan <panlong...@gmail.com> wrote:



Hi Cassandra Community,

I’m currently exploring the use of single vnode (single token) per node in large-scale Cassandra deployments. I've come across discussions suggesting that some heavy users like Apple and Netflix have opted for this configuration to simplify operations and achieve more predictable performance.

I’d like to ask if anyone could point me to resources (blog posts, conference talks, case studies or even personal experiences) that dive deeper into:

  • The rationale behind using a single vnode instead of multiple vnodes.
  • The operational benefits and any potential trade-offs encountered.

Thank you in advance for your insights and any pointers you can provide!

Best regards,
Long
