Yes, my cluster is almost full and there are lots of pending compaction tasks. You've helped me a lot, thank you Eric~
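
For anyone who wants to keep an eye on the pending count discussed in this thread, here is a minimal Python sketch. It assumes the text printed by `nodetool compactionstats` contains a line of the form `pending tasks: N`; the exact layout may differ between Cassandra versions, so treat the pattern as an assumption. The function names and the sample text are hypothetical, and the threshold of 5 is just the rule of thumb from the quoted reply.

```python
import re


def parse_pending_tasks(stats_output: str) -> int:
    """Extract the pending compaction count from compactionstats-style text.

    Assumes a line shaped like 'pending tasks: N' (an assumption about the
    output format, not something guaranteed for every Cassandra version).
    """
    match = re.search(
        r"^\s*pending tasks:\s*(\d+)",
        stats_output,
        re.MULTILINE | re.IGNORECASE,
    )
    if match is None:
        raise ValueError("no 'pending tasks: N' line found")
    return int(match.group(1))


def compaction_is_keeping_up(pending: int, threshold: int = 5) -> bool:
    """Apply the rule of thumb from the thread: pending tasks should be < 5."""
    return pending < threshold


# Made-up sample text for illustration, not real cluster output.
SAMPLE_OUTPUT = "pending tasks: 37\n"

pending = parse_pending_tasks(SAMPLE_OUTPUT)
print(pending, compaction_is_keeping_up(pending))
```

To run this against a live node, the text could be captured with `subprocess.run(["nodetool", "compactionstats"], capture_output=True, text=True).stdout` and passed to `parse_pending_tasks`.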
On Thu, Jan 22, 2015 at 11:59 AM, Eric Stevens <migh...@gmail.com> wrote:

> Yes, bootstrapping a new node will cause read loads on your existing nodes
> - it is becoming the owner and replica of a whole new set of existing
> data. To do that it needs to know what data it's now responsible for, and
> that's what bootstrapping is for.
>
> If you're at the point where bootstrapping a new node is placing a
> too-heavy burden on your existing nodes, you may be dangerously close to or
> even past the tipping point where you ought to have already grown your
> cluster. You need to grow your cluster as soon as possible, and chances
> are you're close to no longer being able to keep up with compaction (see
> nodetool compactionstats, make sure pending tasks is <5, preferably 0 or
> 1). Once you're falling behind on compaction, it becomes difficult to
> successfully bootstrap new nodes, and you're in a very tough spot.
>
>
> On Wed, Jan 21, 2015 at 7:43 PM, Yatong Zhang <bluefl...@gmail.com> wrote:
>
>> Thanks for the reply. Bootstrapping the new node put a heavy burden on the
>> whole cluster and I don't know why. So that's the issue I actually want to
>> fix.
>>
>> On Mon, Jan 12, 2015 at 6:08 AM, Eric Stevens <migh...@gmail.com> wrote:
>>
>>> Yes, but it won't do what I suspect you're hoping for. If you disable
>>> auto_bootstrap in cassandra.yaml the node will join the cluster and will
>>> not stream any old data from existing nodes.
>>>
>>> The cluster will now be in an inconsistent state. If you bring enough
>>> nodes online this way to violate your read consistency level (e.g. RF=3,
>>> CL=Quorum, if you bring on 2 nodes this way), some of your queries will be
>>> missing data that they ought to have returned.
>>>
>>> There is no way to bring a new node online and have it be responsible
>>> just for new data, and have no responsibility for old data. It *will* be
>>> responsible for old data, it just won't *know* about the old data it
>>> should be responsible for. Executing a repair will fix this, but only
>>> because the existing nodes will stream all the missing data to the new
>>> node. This will create more pressure on your cluster than just normal
>>> bootstrapping would have.
>>>
>>> I can't think of any reason you'd want to do that unless you needed to
>>> grow your cluster really quickly, and were ok with corrupting your old data.
>>>
>>> On Sat, Jan 10, 2015 at 12:39 AM, Yatong Zhang <bluefl...@gmail.com>
>>> wrote:
>>>
>>>> Hi there,
>>>>
>>>> I am using C* 2.0.10 and I was trying to add a new node to a
>>>> cluster (actually replacing a dead node). But after adding the new node,
>>>> some other nodes in the cluster had a very high workload, which affected
>>>> the performance of the whole cluster.
>>>> So I am wondering: is there a way to add a new node and have this node
>>>> only handle new data?
>>>>
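
The RF=3, CL=Quorum example in the thread can be worked through with a little arithmetic. The sketch below is a simplified model, not Cassandra code: the function names are hypothetical, and it assumes every replica that was properly bootstrapped holds the data in question. Under that assumption, a read can come back empty once the number of replicas that skipped bootstrap reaches the number of replicas the read has to hear from.

```python
def quorum(replication_factor: int) -> int:
    """Replicas needed for a quorum: a strict majority of the replica set."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def read_may_miss_data(replication_factor: int,
                       empty_replicas: int,
                       required_responses: int) -> bool:
    """True if a read could be answered only by replicas holding no old data.

    Simplified model (an assumption): every replica that is not 'empty'
    holds the old data. The read is safe only when any set of
    `required_responses` replicas must include at least one such replica.
    """
    if not 0 <= empty_replicas <= replication_factor:
        raise ValueError("empty_replicas must be between 0 and RF")
    if not 1 <= required_responses <= replication_factor:
        raise ValueError("required_responses must be between 1 and RF")
    return empty_replicas >= required_responses


rf = 3
needed = quorum(rf)
for empty in range(rf + 1):
    # Shows how many non-bootstrapped replicas a quorum read can tolerate.
    print(empty, read_may_miss_data(rf, empty, needed))
```

With RF=3 a quorum is 2 replicas, so one non-bootstrapped replica is tolerated in this model while two are not, which matches the "2 nodes this way" case in the quoted message. Real clusters can be worse off than the model suggests if the old data was not present on every original replica to begin with.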