The scenario you describe is the typical point where people move away from vnodes and towards single-token-per-node (or a much smaller number of vnodes).
The default vnode setting (~256 tokens per host) puts you in a situation where virtually all hosts are adjacent/neighbors to all others (at least until you're way into the hundreds of hosts), which means you'll stream from nearly all hosts. If you drop the number of vnodes from ~256 to ~4 or ~8 or ~16, you'll see the number of streams drop as well.

Many people with "large" clusters statically allocate tokens to make it predictable: if you have a single token per host, you can add multiple hosts at a time, each streaming from a small number of neighbors, without overlap. It takes a bit more tooling (or manual token calculation) outside of Cassandra, but works well in practice for "large" clusters.

On Tue, Feb 20, 2018 at 4:42 AM, Jürgen Albersdorfer <jalbersdor...@gmail.com> wrote:

> Hi, I'm wondering if it is possible resp. would it make sense to limit
> concurrent streaming when joining a new node to cluster.
>
> I'm currently operating a 15-Node C* Cluster (V 3.11.1) and joining
> another Node every day.
> The 'nodetool netstats' shows it always streams data from all other nodes.
>
> How far will this scale? - What happens when I have hundrets or even
> thousends of Nodes?
>
> Has anyone experience with such a Situation?
>
> Thanks, and regards
> Jürgen
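
P.S. For the manual token calculation mentioned above, here is a rough sketch of what that tooling might look like. It assumes Murmur3Partitioner (token range -2^63 to 2^63-1) and one token per host. The function names `even_tokens` and `doubling_tokens` are made up for illustration and are not part of Cassandra. Each resulting value would presumably go into that node's `initial_token` in cassandra.yaml, together with `num_tokens: 1`.

```python
# Sketch only: single-token allocation for Murmur3Partitioner.
# even_tokens and doubling_tokens are hypothetical helpers, not Cassandra APIs.

MIN_TOKEN = -2**63   # lowest Murmur3Partitioner token
RING_SIZE = 2**64    # number of distinct tokens on the ring


def normalize(token):
    """Wrap a token back into the range [-2**63, 2**63 - 1]."""
    return (token - MIN_TOKEN) % RING_SIZE + MIN_TOKEN


def even_tokens(node_count):
    """Return node_count tokens spaced evenly around the ring."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    step = RING_SIZE // node_count
    return [MIN_TOKEN + i * step for i in range(node_count)]


def doubling_tokens(existing):
    """Return one new token at the midpoint of each existing range.

    Each new host then sits between two existing hosts, so the new
    hosts do not compete with each other for the same range.
    """
    ordered = sorted(existing)
    new_tokens = []
    for i, start in enumerate(ordered):
        # The range after the last token wraps around to the first one.
        end = ordered[(i + 1) % len(ordered)]
        # A single-host ring owns everything, hence the "or RING_SIZE".
        width = (end - start) % RING_SIZE or RING_SIZE
        new_tokens.append(normalize(start + width // 2))
    return new_tokens


if __name__ == "__main__":
    # Tokens for a 15-node cluster like the one in the question.
    for host, token in enumerate(even_tokens(15), start=1):
        print(f"host {host:2d}: initial_token: {token}")
```

Doubling is the simplest case: merging `doubling_tokens(even_tokens(n))` with the existing tokens gives the same ring as `even_tokens(2 * n)`, so the cluster stays balanced after the expansion. Adding one host at a time with a single token leaves the ring uneven unless you also move existing tokens, which is part of the extra tooling cost.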