On 3 Feb 2018 02:42, "Kyrylo Lebediev" <kyrylo_lebed...@epam.com> wrote:
> Thanks, Oleksandr,
>
> In my case I'll need to replace all nodes in the cluster (one by one), so streaming will introduce perceptible overhead. My question is not about the data movement/copy itself, but more about all this token magic.
>
> Okay, let's say we stopped the old node and moved its data to the new node. Once it's started with auto_bootstrap=false, it will be added to the cluster like a usual node, just skipping the streaming stage, right?
>
> For a cluster with vnodes enabled, when a new node is added its token ranges are calculated automatically by C* on startup. So how will C* know that this new node must be responsible for exactly the same token ranges as the old node was? How would the rest of the nodes in the cluster ('peers') figure out that the old node should be replaced in the ring by the new one?
>
> Do you know of any limitations for this process in the case of C* 2.1.x with vnodes enabled?

A node stores its tokens and host id in the system.local table. The next time it starts up, it will use the same tokens as before, and the host id allows the rest of the cluster to see that it is the same node and ignore the IP address change. This happens regardless of the auto_bootstrap setting.

Try "select * from system.local" to see what is recorded for the old node. When the new node starts up, it should log "Using saved tokens" with the list of numbers. Other nodes should log something like "ignoring IP address change" for the affected node addresses.

Be careful, though, to put the data directory exactly where the new node expects to find it; otherwise it might just join as a brand new node, allocating new tokens. As a precaution, it helps to ensure that the system user running the Cassandra process has no permission to create the data directory: this should stop the startup in case of misconfiguration.

Cheers,
--
Alex
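
P.S. The two checks mentioned above (the copied data is where the node will look for it, and the node reports reusing its tokens after startup) could be scripted roughly as follows. This is only a sketch under assumptions: the function names are made up for illustration, the system.local table is assumed to live in a directory named "local" or "local-<something>" under the "system" keyspace directory, and the log is searched only for the "Using saved tokens" text quoted above. Adjust the paths to your own cassandra.yaml and logging setup.

```python
import glob
import os


def has_saved_system_local(data_dir):
    """Return True if data_dir seems to contain a non-empty system.local table.

    Hypothetical helper: run it on the new node BEFORE starting Cassandra,
    to catch a data directory that was copied to the wrong place.
    """
    # Assumed layout: <data_dir>/system/local or <data_dir>/system/local-<id>
    pattern = os.path.join(data_dir, "system", "local*")
    for path in glob.glob(pattern):
        if os.path.isdir(path) and os.listdir(path):
            return True
    return False


def saved_token_lines(log_path):
    """Return the log lines that contain 'Using saved tokens'.

    Hypothetical helper: run it AFTER startup; an empty result suggests
    the node did not pick up the tokens of the old node.
    """
    with open(log_path, encoding="utf-8", errors="replace") as log:
        return [line.rstrip("\n") for line in log if "Using saved tokens" in line]
```

For example, call has_saved_system_local() with the directory listed under data_file_directories in cassandra.yaml and refuse to start the service if it returns False; once the node is up, call saved_token_lines() on its system log and compare the tokens shown with what "select * from system.local" returned on the old node.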