The problem is that I have defined too many racks in my cluster: because I run multiple Cassandra nodes on a single server, I defined each physical server as a separate rack. I had not heard of any "one seed per rack" rule before the TLP article (the only rule about seed nodes I had in mind was "3-4 seed nodes in the cluster is enough; more is unnecessary and hurts performance"), so I have always set up my clusters with 3-4 seed nodes.
I already have a cluster set up with the wrong mechanism (just one seed node with initial_token, then the other nodes bootstrapped one after another), and it seems to be working: it is almost balanced, and when I unplug a whole rack, reads and writes still work with no errors (using CL=ONE). So what would the problem be? Is it catastrophic not to set tokens manually on the seed node of every rack?

I assume that once I define racks, Cassandra will never put two copies of my data in a single rack, whatever happens? (Right now this is my main concern, because I'm OK with my cluster's load balance.)

---- On Mon, 06 May 2019 07:17:14 +0430 Anthony Grasso <anthony.gra...@gmail.com> wrote ----

Hi

If you are planning on setting up a new cluster with allocate_tokens_for_keyspace, then yes, you will need one seed node per rack. As Jon mentioned in a previous email, you must manually specify the token range for each seed node. This can be done using the initial_token setting.

The article you are referring to (https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html) includes Python code which calculates the token ranges for each of the seed nodes. When calling that Python code, you must specify the vnodes (the number of tokens per node) and the number of racks.

Regards,
Anthony

On Sat, 4 May 2019 at 19:14, onmstester onmstester <onmstes...@zoho.com.invalid> wrote:

I just read this article by TLP: https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html

I noticed that:

>> We will need to set the tokens for the seed nodes in each rack manually. This is to prevent each node from randomly calculating its own token ranges

But until now, I was using this recommendation to set up a new cluster:

>> You'll want to set them explicitly using: python -c 'print( [str(((2**64 / 4) * i) - 2**63) for i in range(4)])'
>> After you fire up the first seed, create a keyspace using RF=3 (or whatever you're planning on using) and set allocate_tokens_for_keyspace to that keyspace in your config, and join the rest of the nodes. That gives even distribution.

I've defined plenty of racks in my cluster (and only 3 seed nodes). Should I have a seed node per rack and use initial_token for all of the seed nodes, or would just one seed node with initial_token be OK?

Best Regards
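
For reference, the per-rack seed tokens discussed in this thread can be sketched in Python 3. This is only an illustration under stated assumptions, not the article's own script: the helper name `seed_initial_tokens` is hypothetical, the token range is assumed to be the Murmur3 range from -2**63 to 2**63 - 1, and the interleaved layout (rack 1 gets the 1st, rack 2 the 2nd, ... evenly spaced token, then the pattern repeats) is one plausible way to spread the seeds evenly. Note that under Python 3 the `/` in the quoted one-liner produces floats, so the sketch uses `//` to keep the tokens as exact integers.

```python
def seed_initial_tokens(num_tokens, num_racks):
    """Return one list of evenly spaced tokens per rack.

    Hypothetical helper: assumes one seed node per rack and the
    Murmur3 token range [-2**63, 2**63 - 1].
    """
    total = num_tokens * num_racks
    # Integer division keeps every token an exact int under Python 3.
    step = 2**64 // total
    # Interleave the racks so that consecutive tokens belong to different racks.
    return [
        [step * (t * num_racks + r) - 2**63 for t in range(num_tokens)]
        for r in range(num_racks)
    ]


if __name__ == "__main__":
    # Example values only: 4 vnodes per node, 3 racks.
    racks = seed_initial_tokens(num_tokens=4, num_racks=3)
    for rack, tokens in enumerate(racks, start=1):
        print("rack {} seed initial_token: {}".format(
            rack, ",".join(str(t) for t in tokens)))
```

With `num_tokens=1` and `num_racks=4` the helper yields the same four values the quoted one-liner was meant to print. Each printed list would go into the initial_token setting of that rack's seed node, with num_tokens set to match.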