Yes, bootstrapping a new node will put read load on your existing nodes - the new node is becoming the owner of, and a replica for, a whole set of existing data. To take that over it has to get the data it's now responsible for from the nodes that currently hold it, and that's what bootstrapping is for.
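Because that extra read load lands on nodes that still have to keep compacting, it helps to know how much headroom they have before you start. Here is a rough Python sketch that pulls the pending-task count out of nodetool compactionstats output. It assumes the output contains a line of the form "pending tasks: N"; the function names and the default threshold of 5 are my own illustrative choices, not anything Cassandra provides.

```python
import re
import subprocess


def read_compactionstats() -> str:
    """Run `nodetool compactionstats` locally and return its stdout.

    Assumes nodetool is on PATH and can reach the local node.
    """
    result = subprocess.run(
        ["nodetool", "compactionstats"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def pending_compactions(output: str) -> int:
    """Extract N from the first 'pending tasks: N' line in the output."""
    match = re.search(r"^pending tasks:\s*(\d+)", output,
                      re.MULTILINE | re.IGNORECASE)
    if match is None:
        raise ValueError("no 'pending tasks' line found in output")
    return int(match.group(1))


def keeping_up(pending: int, threshold: int = 5) -> bool:
    """True if the node is under the (illustrative) pending-task threshold."""
    return pending < threshold


# Example with canned output rather than a live node:
sample = "pending tasks: 12\n"
print(keeping_up(pending_compactions(sample)))  # False
```

Run it against each existing node before adding a new one; a node that reports False here is one I would expect to struggle while streaming.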
If you're at the point where bootstrapping a new node places too heavy a burden on your existing nodes, you may be dangerously close to, or even past, the tipping point where you ought to have already grown your cluster. You need to grow your cluster as soon as possible, and chances are you're close to no longer being able to keep up with compaction (see nodetool compactionstats, and make sure pending tasks is <5, preferably 0 or 1). Once you're falling behind on compaction, it becomes difficult to successfully bootstrap new nodes, and you're in a very tough spot.

On Wed, Jan 21, 2015 at 7:43 PM, Yatong Zhang <bluefl...@gmail.com> wrote:

> Thanks for the reply. The bootstrap of the new node put a heavy burden on the
> whole cluster and I don't know why. So that's the issue I want to fix
> actually.
>
> On Mon, Jan 12, 2015 at 6:08 AM, Eric Stevens <migh...@gmail.com> wrote:
>
>> Yes, but it won't do what I suspect you're hoping for. If you disable
>> auto_bootstrap in cassandra.yaml the node will join the cluster and will
>> not stream any old data from existing nodes.
>>
>> The cluster will now be in an inconsistent state. If you bring enough
>> nodes online this way to violate your read consistency level (e.g. RF=3,
>> CL=Quorum, if you bring on 2 nodes this way), some of your queries will be
>> missing data that they ought to have returned.
>>
>> There is no way to bring a new node online and have it be responsible
>> just for new data, and have no responsibility for old data. It *will* be
>> responsible for old data, it just won't *know* about the old data it
>> should be responsible for. Executing a repair will fix this, but only
>> because the existing nodes will stream all the missing data to the new
>> node. This will create more pressure on your cluster than just normal
>> bootstrapping would have.
>>
>> I can't think of any reason you'd want to do that unless you needed to
>> grow your cluster really quickly, and were ok with corrupting your old data.
>>
>> On Sat, Jan 10, 2015 at 12:39 AM, Yatong Zhang <bluefl...@gmail.com>
>> wrote:
>>
>>> Hi there,
>>>
>>> I am using C* 2.0.10 and I was trying to add a new node to a
>>> cluster (actually replacing a dead node). But after adding the new node,
>>> some other nodes in the cluster had a very high workload, which affected
>>> the performance of the whole cluster.
>>> So I am wondering: is there a way to add a new node and have this node
>>> only take new data?
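
To make the RF=3, CL=Quorum case from the quoted message concrete, here is a small Python sketch of the arithmetic. It is a simplification under stated assumptions: the nodes joined without auto_bootstrap are all replicas for the same token range, the remaining replicas do hold the old data, and read repair is ignored. The function names are made up for illustration.

```python
def quorum(rf: int) -> int:
    """Number of replicas a QUORUM operation needs for replication factor rf."""
    return rf // 2 + 1


def quorum_read_can_miss_data(rf: int, empty_replicas: int) -> bool:
    """True if a QUORUM read could be served entirely by replicas that
    joined without streaming and so hold none of the old data.

    Assumption: all `empty_replicas` are replicas for the same range.
    """
    if not 0 <= empty_replicas <= rf:
        raise ValueError("empty_replicas must be between 0 and rf")
    return empty_replicas >= quorum(rf)


print(quorum(3))                        # 2
print(quorum_read_can_miss_data(3, 1))  # False
print(quorum_read_can_miss_data(3, 2))  # True
```

With one empty replica, every quorum of two still includes a node that has the old data. With two, a read can land on both empty nodes and return nothing, which is the inconsistency described above.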