The scenario you describe is the typical point where people move away from
vnodes and towards single-token-per-node (or a much smaller number of
vnodes).

With the default setting (~256 vnodes per host), virtually every host is a
neighbor of every other host (at least until you're well into the hundreds
of hosts), which means a joining node streams from nearly all of them. If
you drop the number of vnodes from ~256 to ~4, ~8 or ~16, you'll see the
number of streams drop as well.
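
To make the neighbor effect concrete, here is a rough Python sketch. It is
a toy model, not how Cassandra actually plans streams: it assumes random
tokens on the Murmur3 ring, ignores replication factor and racks, and only
counts which hosts own the token immediately after each token of the
joining node. The function name streaming_sources is made up for this
example.

```python
import bisect
import random

# Token range of Murmur3Partitioner (assumed partitioner for this sketch).
RING_MIN = -2**63
RING_MAX = 2**63 - 1


def streaming_sources(num_nodes, vnodes, seed=0):
    """Count distinct hosts a joining node would take primary ranges from.

    Toy model: every host (existing and joining) gets `vnodes` random
    tokens; each new token splits the range owned by the next token on
    the ring, so that token's host becomes a streaming source.
    """
    rng = random.Random(seed)
    ring = []
    for node in range(num_nodes):
        for _ in range(vnodes):
            ring.append((rng.randint(RING_MIN, RING_MAX), node))
    ring.sort()
    tokens = [token for token, _ in ring]

    sources = set()
    for _ in range(vnodes):
        new_token = rng.randint(RING_MIN, RING_MAX)
        # Successor token on the ring, wrapping around at the end.
        idx = bisect.bisect_right(tokens, new_token) % len(ring)
        sources.add(ring[idx][1])
    return len(sources)


if __name__ == "__main__":
    for v in (1, 4, 16, 256):
        print(v, "vnodes ->", streaming_sources(15, v), "source hosts")
```

In this model the number of sources can never exceed the number of vnodes
on the joining node, which is why a small vnode count caps the number of
streams. With replication the real number is higher, but the trend is the
same.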

Many people with "large" clusters statically allocate tokens to make it
predictable - if you have a single token per host, you can add multiple
hosts at a time, each streaming from a small number of neighbors, without
overlap.

It takes a bit more tooling (or manual token calculation) outside of
Cassandra, but works well in practice for "large" clusters.
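
The manual token calculation can be sketched in a few lines of Python.
This assumes Murmur3Partitioner (token range -2**63 to 2**63 - 1) and one
token per host, with each value then set by hand as initial_token in that
host's cassandra.yaml. The helper names balanced_tokens and midpoint_tokens
are invented for this example, and doubling the cluster by bisecting every
range is just one possible expansion scheme.

```python
# Token range of Murmur3Partitioner (assumed partitioner for this sketch).
RING_MIN = -2**63
RING_MAX = 2**63 - 1
RING_SIZE = 2**64


def balanced_tokens(num_nodes):
    """Evenly spaced tokens for a single-token-per-host cluster."""
    return [RING_MIN + (i * RING_SIZE) // num_nodes for i in range(num_nodes)]


def midpoint_tokens(existing):
    """Tokens for new hosts placed halfway into each existing range.

    Each new host bisects exactly one existing range, so no two joining
    hosts split the same range.
    """
    ordered = sorted(existing)
    new_tokens = []
    for i, token in enumerate(ordered):
        successor = ordered[(i + 1) % len(ordered)]
        # Distance to the next token, wrapping around the ring.
        gap = (successor - token) % RING_SIZE
        if gap == 0:  # single-host cluster: the range is the whole ring
            gap = RING_SIZE
        mid = token + gap // 2
        if mid > RING_MAX:
            mid -= RING_SIZE
        new_tokens.append(mid)
    return new_tokens


if __name__ == "__main__":
    current = balanced_tokens(4)
    print("existing:", current)
    print("new hosts:", midpoint_tokens(current))
```

Bisecting every range of an evenly balanced N-host ring yields the same
tokens as an evenly balanced 2N-host ring, so the cluster stays balanced
after the expansion.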




On Tue, Feb 20, 2018 at 4:42 AM, Jürgen Albersdorfer <
jalbersdor...@gmail.com> wrote:

> Hi, I'm wondering whether it is possible, or whether it would make sense,
> to limit concurrent streaming when joining a new node to the cluster.
>
> I'm currently operating a 15-Node C* Cluster (V 3.11.1) and joining
> another Node every day.
> The 'nodetool netstats' shows it always streams data from all other nodes.
>
> How far will this scale? What happens when I have hundreds or even
> thousands of nodes?
>
> Does anyone have experience with such a situation?
>
> Thanks, and regards
> Jürgen
>
