On Wed, Jan 5, 2011 at 10:05 AM, Ran Tavory <ran...@gmail.com> wrote:
> In storage-conf I see this comment [1] from which I understand that the
> recommended way to bootstrap a new node is to set AutoBootstrap=true and
> remove itself from the seeds list.
> Moreover, I did try to set AutoBootstrap=true and have the node in its own
> seeds list, but it would not bootstrap. I don't recall the exact message,
> but it was something like "I found myself in the seeds list therefore I'm
> not going to bootstrap even though AutoBootstrap is true".
>
> [1]
> <!--
>  ~ Turn on to make new [non-seed] nodes automatically migrate the right
>  ~ data to themselves. (If no InitialToken is specified, they will pick one
>  ~ such that they will get half the range of the most-loaded node.)
>  ~ If a node starts up without bootstrapping, it will mark itself
>  ~ bootstrapped so that you can't subsequently accidentally bootstrap a
>  ~ node with data on it. (You can reset this by wiping your data and
>  ~ commitlog directories.)
>  ~
>  ~ Off by default so that new clusters and upgraders from 0.4 don't
>  ~ bootstrap immediately. You should turn this on when you start adding
>  ~ new nodes to a cluster that already has data on it. (If you are
>  ~ upgrading from 0.4, start your cluster with it off once before changing
>  ~ it to true. Otherwise, no data will be lost but you will incur a lot of
>  ~ unnecessary I/O before your cluster starts up.)
> -->
> <AutoBootstrap>false</AutoBootstrap>
>
> On Wed, Jan 5, 2011 at 4:58 PM, David Boxenhorn <da...@lookin2.com> wrote:
>>
>> If "seed list should be the same across the cluster", that means that
>> nodes *should* have themselves as a seed. If that doesn't work for Ran,
>> then that is the first problem, no?
>>
>> On Wed, Jan 5, 2011 at 3:56 PM, Jake Luciani <jak...@gmail.com> wrote:
>>>
>>> Well, your ring issues don't make sense to me; the seed list should be
>>> the same across the cluster.
>>> I'm just thinking of other things to try; non-bootstrapped nodes should
>>> join the ring instantly, but reads will fail if you aren't using quorum.
>>>
>>> On Wed, Jan 5, 2011 at 8:51 AM, Ran Tavory <ran...@gmail.com> wrote:
>>>>
>>>> I haven't tried repair. Should I?
>>>>
>>>> On Jan 5, 2011 3:48 PM, "Jake Luciani" <jak...@gmail.com> wrote:
>>>> > Have you tried not bootstrapping but setting the token and manually
>>>> > calling repair?
>>>> >
>>>> > On Wed, Jan 5, 2011 at 7:07 AM, Ran Tavory <ran...@gmail.com> wrote:
>>>> >
>>>> >> My conclusion is lame: I tried this on several hosts and saw the same
>>>> >> behavior. The only way I was able to join new nodes was to first
>>>> >> start them when they are *not in* their own seeds list and, after
>>>> >> they finish transferring the data, restart them with themselves *in*
>>>> >> their own seeds list. After doing that the node would join the ring.
>>>> >> This is either my misunderstanding or a bug, but the only place I
>>>> >> found it documented stated that the new node should not be in its
>>>> >> own seeds list. Version 0.6.6.
>>>> >>
>>>> >> On Wed, Jan 5, 2011 at 10:35 AM, David Boxenhorn <da...@lookin2.com> wrote:
>>>> >>
>>>> >>> My nodes all have themselves in their list of seeds - always did -
>>>> >>> and everything works. (You may ask why I did this. I don't know, I
>>>> >>> must have copied it from an example somewhere.)
>>>> >>>
>>>> >>> On Wed, Jan 5, 2011 at 9:42 AM, Ran Tavory <ran...@gmail.com> wrote:
>>>> >>>
>>>> >>>> I was able to make the node join the ring, but I'm confused.
>>>> >>>> What I did is, first, when adding the node, this node was not in
>>>> >>>> the seeds list of itself. AFAIK this is how it's supposed to be.
>>>> >>>> So it was able to transfer all data to itself from other nodes,
>>>> >>>> but then it stayed in the bootstrapping state.
>>>> >>>> So what I did (and I don't know why it works) is add this node to
>>>> >>>> the seeds list in its own storage-conf.xml file, then restart the
>>>> >>>> server, and then I finally see it in the ring...
>>>> >>>> If I had added the node to the seeds list of itself when first
>>>> >>>> joining it, it would not join the ring, but if I do it in two
>>>> >>>> phases it did work.
>>>> >>>> So it's either my misunderstanding or a bug...
>>>> >>>>
>>>> >>>> On Wed, Jan 5, 2011 at 7:14 AM, Ran Tavory <ran...@gmail.com> wrote:
>>>> >>>>
>>>> >>>>> The new node does not see itself as part of the ring; it sees all
>>>> >>>>> others but itself, so from that perspective the view is
>>>> >>>>> consistent. The only problem is that the node never finishes
>>>> >>>>> bootstrapping. It stays in this state for hours (it's been 20
>>>> >>>>> hours now...)
>>>> >>>>>
>>>> >>>>> $ bin/nodetool -p 9004 -h localhost streams
>>>> >>>>>> Mode: Bootstrapping
>>>> >>>>>> Not sending any streams.
>>>> >>>>>> Not receiving any streams.
>>>> >>>>>
>>>> >>>>> On Wed, Jan 5, 2011 at 1:20 AM, Nate McCall <n...@riptano.com> wrote:
>>>> >>>>>
>>>> >>>>>> Does the new node have itself in the list of seeds per chance?
>>>> >>>>>> This could cause some issues if so.
>>>> >>>>>>
>>>> >>>>>> On Tue, Jan 4, 2011 at 4:10 PM, Ran Tavory <ran...@gmail.com> wrote:
>>>> >>>>>> > I'm still at a loss. I haven't been able to resolve this. I
>>>> >>>>>> > tried adding another node at a different location on the ring,
>>>> >>>>>> > but this node too remains stuck in the bootstrapping state for
>>>> >>>>>> > many hours, without any of the other nodes being busy with
>>>> >>>>>> > anti-compaction or anything else.
>>>> >>>>>> > I don't know what's keeping it from finishing the bootstrap:
>>>> >>>>>> > no CPU, no IO, files were already streamed, so what is it
>>>> >>>>>> > waiting for?
>>>> >>>>>> > I read the release notes of 0.6.7 and 0.6.8 and there didn't
>>>> >>>>>> > seem to be anything addressing a similar issue, so I figured
>>>> >>>>>> > there was no point in upgrading. But let me know if you think
>>>> >>>>>> > there is. Or any other advice...
>>>> >>>>>> >
>>>> >>>>>> > On Tuesday, January 4, 2011, Ran Tavory <ran...@gmail.com> wrote:
>>>> >>>>>> >> Thanks Jake, but unfortunately the streams directory is
>>>> >>>>>> >> empty, so I don't think that any of the nodes is
>>>> >>>>>> >> anti-compacting data right now or had been in the past 5
>>>> >>>>>> >> hours. It seems that all the data was already transferred to
>>>> >>>>>> >> the joining host, but the joining node, after having received
>>>> >>>>>> >> the data, would still remain in bootstrapping mode and not
>>>> >>>>>> >> join the cluster. I'm not sure that *all* data was
>>>> >>>>>> >> transferred (perhaps other nodes need to transfer more data),
>>>> >>>>>> >> but nothing is actually happening, so I assume all has been
>>>> >>>>>> >> moved.
>>>> >>>>>> >> Perhaps it's a configuration error on my part. Should I use
>>>> >>>>>> >> AutoBootstrap=true? Anything else I should look out for in
>>>> >>>>>> >> the configuration file or something else?
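[Ran's configuration question above can be checked mechanically. The sketch below writes a minimal stand-in for a 0.6-era storage-conf.xml (the file path and IP addresses are hypothetical, not from the thread) and verifies the two conditions discussed: AutoBootstrap is on, and the joining node is absent from its own <Seeds> list.]

```shell
# Sketch of a pre-bootstrap sanity check for a 0.6-era storage-conf.xml.
# The config written here is a minimal stand-in; the IPs are hypothetical.
cat > /tmp/storage-conf-sketch.xml <<'EOF'
<Storage>
  <AutoBootstrap>true</AutoBootstrap>
  <Seeds>
    <Seed>10.0.0.1</Seed> <!-- an existing ring member, not the new node -->
  </Seeds>
</Storage>
EOF

NEW_NODE_IP=10.0.0.5   # the joining node's own address (hypothetical)
if grep -q "<Seed>$NEW_NODE_IP</Seed>" /tmp/storage-conf-sketch.xml; then
  echo "WARNING: node lists itself as a seed; it will refuse to bootstrap"
else
  echo "ok: node is not in its own seeds list"
fi
grep -o '<AutoBootstrap>.*</AutoBootstrap>' /tmp/storage-conf-sketch.xml
```

[Per the thread, a node that finds itself in its own seeds list will not bootstrap at all, so this check is worth running before the first start.]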
>>>> >>>>>> >>
>>>> >>>>>> >> On Tue, Jan 4, 2011 at 4:08 PM, Jake Luciani <jak...@gmail.com> wrote:
>>>> >>>>>> >>
>>>> >>>>>> >> In 0.6, locate the node doing anti-compaction and look in the
>>>> >>>>>> >> "streams" subdirectory in the keyspace data dir to monitor
>>>> >>>>>> >> the anti-compaction progress (it puts new SSTables for the
>>>> >>>>>> >> bootstrapping node in there).
>>>> >>>>>> >>
>>>> >>>>>> >> On Tue, Jan 4, 2011 at 8:01 AM, Ran Tavory <ran...@gmail.com> wrote:
>>>> >>>>>> >>
>>>> >>>>>> >> Running nodetool decommission didn't help. Actually the node
>>>> >>>>>> >> refused to decommission itself (b/c it wasn't part of the
>>>> >>>>>> >> ring). So I simply stopped the process, deleted all the data
>>>> >>>>>> >> directories and started it again. It worked in the sense that
>>>> >>>>>> >> the node bootstrapped again, but as before, after it had
>>>> >>>>>> >> finished moving the data nothing happened for a long time
>>>> >>>>>> >> (I'm still waiting, but nothing seems to be happening).
>>>> >>>>>> >>
>>>> >>>>>> >> Any hints on how to analyze a "stuck" bootstrapping node?
>>>> >>>>>> >> Thanks.
>>>> >>>>>> >>
>>>> >>>>>> >> On Tue, Jan 4, 2011 at 1:51 PM, Ran Tavory <ran...@gmail.com> wrote:
>>>> >>>>>> >> Thanks Shimi, so indeed anticompaction was run on one of the
>>>> >>>>>> >> other nodes from the same DC, but to my understanding it has
>>>> >>>>>> >> already ended. A few hours ago...
>>>> >>>>>> >>
>>>> >>>>>> >> I see plenty of log messages such as [1], which ended a
>>>> >>>>>> >> couple of hours ago, and I've seen the new node streaming and
>>>> >>>>>> >> accepting the data from the node which performed the
>>>> >>>>>> >> anticompaction, and so far it was normal, so it seemed that
>>>> >>>>>> >> the data is in its right place.
>>>> >>>>>> >> But now the new node seems sort of stuck. None of the other
>>>> >>>>>> >> nodes is anticompacting right now or has been anticompacting
>>>> >>>>>> >> since then.
>>>> >>>>>> >>
>>>> >>>>>> >> The new node's CPU is close to zero, its iostats are almost
>>>> >>>>>> >> zero, so I can't find another bottleneck that would keep it
>>>> >>>>>> >> hanging.
>>>> >>>>>> >> On the IRC someone suggested I'd maybe retry to join this
>>>> >>>>>> >> node, e.g. decommission and rejoin it again. I'll try it
>>>> >>>>>> >> now...
>>>> >>>>>> >>
>>>> >>>>>> >> [1] INFO [COMPACTION-POOL:1] 2011-01-04 04:04:09,721
>>>> >>>>>> >> CompactionManager.java (line 338) AntiCompacting
>>>> >>>>>> >> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
>>>> >>>>>> >>
>>>> >>>>>> >> INFO [COMPACTION-POOL:1] 2011-01-04 04:34:18,683
>>>> >>>>>> >> CompactionManager.java (line 338) AntiCompacting
>>>> >>>>>> >> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3874-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3873-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3876-Data.db')]
>>>> >>>>>> >>
>>>> >>>>>> >> INFO [COMPACTION-POOL:1] 2011-01-04 04:34:19,132
>>>> >>>>>> >> CompactionManager.java (line 338) AntiCompacting
>>>> >>>>>> >> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-951-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-976-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-978-Data.db')]
>>>> >>>>>> >>
>>>> >>>>>> >> INFO [COMPACTION-POOL:1] 2011-01-04 04:34:26,486
>>>> >>>>>> >> CompactionManager.java (line 338) AntiCompacting
>>>> >>>>>> >> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
>>>> >>>>>> >>
>>>> >>>>>> >> On Tue, Jan 4, 2011 at 12:45 PM, shimi <shim...@gmail.com> wrote:
>>>> >>>>>> >>
>>>> >>>>>> >> In my experience, most of the time it takes for a node to
>>>> >>>>>> >> join the cluster is the anticompaction on the other nodes.
>>>> >>>>>> >> The streaming part is very fast.
>>>> >>>>>> >> Check the other nodes' logs to see if there is any node doing
>>>> >>>>>> >> anticompaction. I don't remember how much data I had in the
>>>> >>>>>> >> cluster when I needed to add/remove nodes. I do remember that
>>>> >>>>>> >> it took a few hours.
>>>> >>>>>> >>
>>>> >>>>>> >> The node will join the ring only when it finishes the
>>>> >>>>>> >> bootstrap.
>>>> >>>>>> >> --
>>>> >>>>>> >> /Ran
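[The diagnosis that runs through the thread boils down to two checks: poll `nodetool streams` on the joining node, and look for pending anti-compaction output in the "streams" subdirectory on the sending nodes, as Jake described. A sketch of that check follows; the nodetool call is stubbed out with the sample 0.6 output quoted earlier so the logic is visible end to end, and the data-directory path is a stand-in, not the real one from the thread.]

```shell
# Sketch: decide whether a joining 0.6 node is still receiving data.
# nodetool_streams stands in for: bin/nodetool -h <newnode> -p 9004 streams
nodetool_streams() {
  printf 'Mode: Bootstrapping\nNot sending any streams.\nNot receiving any streams.\n'
}

# An empty "streams" subdirectory on the sending nodes means no pending
# anti-compaction output either (directory name per Jake's message above).
STREAMS_DIR=/tmp/keyspace_data/streams   # stand-in for the real data dir
mkdir -p "$STREAMS_DIR"

if nodetool_streams | grep -q 'Not receiving any streams.' \
   && [ -z "$(ls -A "$STREAMS_DIR")" ]; then
  echo "no data in flight; a node still in Mode: Bootstrapping is stuck"
fi
```

[When both checks come back empty yet the node still reports Mode: Bootstrapping, that matches the "stuck" state Ran describes: nothing left to wait for, but the node never marks itself joined.]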
If non-auto-bootstrap nodes do not join, check to make sure good old
iptables is not on.

Edward
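[Edward's firewall point is easy to rule in or out. The sketch below probes the two ports a 0.6 cluster uses by default, 7000 for inter-node storage traffic and 9160 for Thrift clients; the host list is a placeholder, and the ports should be adjusted to whatever your storage-conf.xml actually sets.]

```shell
# Sketch: probe the ports Cassandra 0.6 needs between ring members.
# 7000 = StoragePort (inter-node), 9160 = ThriftPort (clients).
# The host below is a placeholder; list the actual ring members instead.
for host in 127.0.0.1; do
  for port in 7000 9160; do
    if timeout 1 bash -c "echo > /dev/tcp/$host/$port" 2>/dev/null; then
      echo "$host:$port reachable"
    else
      echo "$host:$port unreachable - check iptables on $host"
    fi
  done
done
```

[Run this from each node toward every other node; a port that is open locally but unreachable from a peer points at iptables (or another filter) rather than at Cassandra.]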