I have run clusters whose nodes had different disk sizes by giving them different num_tokens values. I used simple math: scale num_tokens by the same percentage as the change in disk size. (So, if my "normal" node had 8 tokens, one with double the disk space would get 16.)
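A minimal sketch of that arithmetic in Python, in case it helps. The function name and the TiB units are only illustrative (nothing here is a Cassandra API), and it assumes tokens should scale linearly with total disk capacity per node, rounded to the nearest whole token:

```python
def scaled_num_tokens(base_tokens: int, base_disk_tib: float, node_disk_tib: float) -> int:
    """Scale num_tokens in proportion to disk capacity (hypothetical helper).

    base_tokens   -- num_tokens on the "normal" node
    base_disk_tib -- total disk capacity of the "normal" node
    node_disk_tib -- total disk capacity of the node being sized
    """
    if base_tokens < 1 or base_disk_tib <= 0 or node_disk_tib <= 0:
        raise ValueError("token count and disk sizes must be positive")
    # Same ratio as the disk size change; never go below one token.
    return max(1, round(base_tokens * node_disk_tib / base_disk_tib))


if __name__ == "__main__":
    # "Normal" node: 8 tokens on 4 TiB. A node with double the disk gets 16.
    print(scaled_num_tokens(8, 4.0, 8.0))
    # A node with 1.5x the disk gets 12.
    print(scaled_num_tokens(8, 4.0, 6.0))
```

The result is the value that would go into num_tokens in that node's cassandra.yaml.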
One thing to watch/consider: the larger (number of tokens) * (number of nodes) gets, the harder repairs have to work.

Sean R. Durity

INTERNAL USE

-----Original Message-----
From: Marc Hoppins <marc.hopp...@eset.com>
Sent: Wednesday, June 15, 2022 3:34 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Configuration for new(expanding) cluster and new admins.

Hi all,

Say we have 2 datacentres with 12 nodes in each. All hardware is the same.

4-core, 2 x HDD (eg, 4TiB)
num_tokens = 16 as a start point

If the plan is to gradually increase the nodes per DC, and new hardware will have more of everything, especially storage, I assume I increase the num_tokens value. Should I have started with a lower value?

What would be considered a good adjustment for:

Any increase in number of HDD for any node?
Any increase in capacity per HDD for any node?

Is there any direct correlation between new token count and the proportional increase in either quantity of devices or total capacity, or is any adjustment purely arbitrary just to differentiate between varied nodes?

Thanks

M