Hi all,
I'm just trying to better understand how Cassandra works.
My understanding is that, once set, the number of vnodes in a cluster does not
change. The partitioner allocates vnodes to nodes, ensuring that replicas of
the same data are not stored on the same node.
But what happens if there are more nodes t
Hey,
num_tokens is tokens per node.
So in your case you would have 15 vnodes altogether.
Cheers,
Hannu
> On 15. Jun 2022, at 10.08, Luca Rondanini wrote:
>
> Hi all,
>
> I'm just trying to understand better how cassandra works.
>
> My understanding is that, once set, the number of vnodes d
OK, that makes sense, but does the partitioner add vnodes? Is the number of
vnodes fixed in a cluster?
On Wed, Jun 15, 2022 at 12:10 AM Hannu Kröger wrote:
> Hey,
>
> num_tokens is tokens per node.
>
> So in your case you would have 15 vnodes altogether.
>
> Cheers,
> Hannu
>
> > On 15. Jun 2022
When a node joins a cluster, it gets (semi-)random tokens based on its
num_tokens value.
The total number of vnodes is not fixed. I don’t remember off the top of my
head whether num_tokens can be different on each node, but whenever you add a
node, new
vnodes get “created”. Existing token ranges will be split and som
Thanks a lot Hannu,
Really helpful! But isn't that crazy expensive? Adding a vnode means that
every vnode in the cluster will have a different range of tokens, which
means a lot of data will need to be moved around.
Thanks again,
Luca
On Wed, Jun 15, 2022 at 12:25 AM Hannu Kröger wrote:
> Whe
Hi all,
Say we have 2 datacentres with 12 nodes in each. All hardware is the same:
4-core, 2 x HDD (e.g. 4 TiB)
num_tokens = 16 as a starting point
If the plan is to gradually increase the nodes per DC, and new hardware will have
more of everything, especially storage, I assume I increase the num_tok
Adding a token (which in essence is a vnode) means that the token range that it
hits will be split into two. The part of that range which now has a new owner
will be replicated to the new owner node. If there are a lot of tokens (=vnodes) in the
cluster, adding some amount of vnodes (e.g. num_tokens=16
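
To illustrate the range-splitting point, here is a minimal Python sketch of a token ring. It is only a toy model under stated assumptions, not how Cassandra itself allocates tokens: tokens are drawn uniformly at random from a simplified 0..2^64-1 space, replication is ignored, and the helper names (`ownership`, `add_node`) and node names are made up for the example.

```python
import random

RING_SIZE = 2**64  # assumption: simplified token space, 0 .. 2^64 - 1


def ownership(ring):
    """Map each node to the width of the ring it owns.

    `ring` is a sorted list of (token, node) pairs; each token owns the
    range from the previous token (exclusive) up to itself (inclusive).
    """
    owned = {}
    for i, (token, node) in enumerate(ring):
        prev_token = ring[i - 1][0]  # i == 0 wraps around to the last token
        width = (token - prev_token) % RING_SIZE or RING_SIZE
        owned[node] = owned.get(node, 0) + width
    return owned


def add_node(ring, node, num_tokens, rng):
    """Return a new ring in which `node` has joined with `num_tokens` random tokens."""
    taken = {token for token, _ in ring}
    new_tokens = set()
    while len(new_tokens) < num_tokens:
        candidate = rng.randrange(RING_SIZE)
        if candidate not in taken:
            new_tokens.add(candidate)
    return sorted(ring + [(token, node) for token in new_tokens])


rng = random.Random(42)
ring = []
for name in ("node1", "node2", "node3"):  # hypothetical node names
    ring = add_node(ring, name, 16, rng)

grown = add_node(ring, "node4", 16, rng)
before = ownership(ring)
after = ownership(grown)

# Existing tokens stay where they are; only the ranges the new tokens
# land in are split, so each old node can only lose part of its share.
for name in sorted(after):
    old = before.get(name, 0) / RING_SIZE
    new = after[name] / RING_SIZE
    print(f"{name}: {old:.3f} -> {new:.3f}")
```

In this model the only data that changes owner is what lands in the new node's ranges; the existing nodes keep all of their tokens and only hand over the split-off part of the affected ranges.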
Awesome, thank you so much! I completely missed the part "the token range
that it hits will be split", now everything makes sense!
Again, thanks a lot for your help!
Luca
On Wed, Jun 15, 2022 at 1:04 AM Hannu Kröger wrote:
> Adding a token (which in essence is a vnode) means that the token ra
You shouldn't need to change num_tokens at all. num_tokens helps you
pretend your cluster is bigger than it is and randomly selects tokens for
you so that your data is approximately evenly distributed. As you add more
hosts, it should balance out automatically.
The alternative to num_tokens is
If you set a different num_tokens value for new hosts (the value should
never be changed on an existing host), the amount of data moved to that
host will be proportional to the num_tokens value. So, if the new hosts
are set to 32 when they're added to the cluster, those hosts will get twice
as muc
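
As a rough worked example of that proportionality (a sketch, assuming randomly placed tokens so that each token owns an equal slice of the ring on average; the 12-host figure and the host names are purely illustrative, and `expected_shares` is a hypothetical helper):

```python
from fractions import Fraction


def expected_shares(num_tokens_by_host):
    """Expected fraction of the ring per host, assuming every token
    owns an equal-sized range on average."""
    total = sum(num_tokens_by_host.values())
    return {host: Fraction(n, total) for host, n in num_tokens_by_host.items()}


# Illustrative cluster: 12 existing hosts at num_tokens=16, one new host at 32.
hosts = {f"old-{i}": 16 for i in range(12)}
hosts["new-0"] = 32

shares = expected_shares(hosts)
print(shares["new-0"], shares["old-0"])  # 1/7 1/14
```

With real random tokens the actual shares will scatter around these expected values, more so when num_tokens is small.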
Thanks for that info.
I did see in the documentation that a value of 16 was not recommended for >50
hosts. Our existing HBase is 76 regionservers so I would imagine that
(eventually) we will see a similar figure.
There will be some scenarios where an initial setup may have (e.g.) 2 x 8 HDD and
f