Re: Bootstrapping taking long

Nate McCall Tue, 04 Jan 2011 15:20:32 -0800

Does the new node have itself in the list of seeds, by any chance? That
could cause some issues if so.
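
One way to check this mechanically is to compare the node's listen address against its seed list. The sketch below is only an illustration: it assumes a 0.6-style storage-conf.xml with Storage, Seeds/Seed and ListenAddress elements, the addresses are made up, and the helper name seeds_include_self is hypothetical. It compares literal strings only and does not resolve hostnames.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment in the style of a 0.6 storage-conf.xml.
# Element names are assumed; the addresses are invented for illustration.
SAMPLE_CONF = """
<Storage>
  <AutoBootstrap>true</AutoBootstrap>
  <Seeds>
    <Seed>10.0.0.1</Seed>
    <Seed>10.0.0.5</Seed>
  </Seeds>
  <ListenAddress>10.0.0.5</ListenAddress>
</Storage>
"""

def seeds_include_self(conf_xml):
    """Return True if the configured listen address also appears as a seed."""
    root = ET.fromstring(conf_xml.strip())
    listen = (root.findtext("ListenAddress") or "").strip()
    seeds = {(seed.text or "").strip() for seed in root.iter("Seed")}
    # An empty listen address cannot be matched, so treat it as "not a seed".
    return bool(listen) and listen in seeds

print(seeds_include_self(SAMPLE_CONF))
```

If this reports True for the joining node, its own address is in its seed list and is worth removing before retrying.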


On Tue, Jan 4, 2011 at 4:10 PM, Ran Tavory <ran...@gmail.com> wrote:
> I'm still at a loss. I haven't been able to resolve this. I tried
> adding another node at a different location on the ring, but this node
> too remains stuck in the bootstrapping state for many hours without
> any of the other nodes being busy with anti-compaction or anything
> else. I don't know what's keeping it from finishing the bootstrap: no
> CPU, no I/O, and the files were already streamed, so what is it waiting for?
> I read the release notes of 0.6.7 and 0.6.8 and there didn't seem to
> be anything addressing a similar issue so I figured there was no point
> in upgrading. But let me know if you think there is.
> Or any other advice...
>
> On Tuesday, January 4, 2011, Ran Tavory <ran...@gmail.com> wrote:
>> Thanks Jake, but unfortunately the streams directory is empty, so I don't
>> think any of the nodes is anti-compacting data right now or has been in
>> the past 5 hours. It seems that all the data was already transferred to the
>> joining host, but the joining node, even after receiving the data,
>> remains in bootstrapping mode and does not join the cluster. I'm not sure
>> that *all* data was transferred (perhaps other nodes need to transfer more
>> data) but nothing is actually happening so I assume all has been moved.
>> Perhaps it's a configuration error on my part. Should I use
>> AutoBootstrap=true? Anything else I should look out for in the
>> configuration file or elsewhere?
>>
>>
>> On Tue, Jan 4, 2011 at 4:08 PM, Jake Luciani <jak...@gmail.com> wrote:
>>
>> In 0.6, locate the node doing anti-compaction and look in the "streams"
>> subdirectory in the keyspace data dir to monitor the anti-compaction
>> progress (it puts new SSTables for the bootstrapping node in there).
>>
>>
>> On Tue, Jan 4, 2011 at 8:01 AM, Ran Tavory <ran...@gmail.com> wrote:
>>
>>
>> Running nodetool decommission didn't help. Actually the node refused to 
>> decommission itself (b/c it wasn't part of the ring). So I simply stopped 
>> the process, deleted all the data directories and started it again. It 
>> worked in the sense that the node bootstrapped again but, as before, after it
>> had finished moving the data nothing happened for a long time (I'm still 
>> waiting, but nothing seems to be happening).
>>
>>
>>
>>
>> Any hints on how to analyze a "stuck" bootstrapping node? Thanks.
>> On Tue, Jan 4, 2011 at 1:51 PM, Ran Tavory <ran...@gmail.com> wrote:
>> Thanks Shimi, so indeed anticompaction was run on one of the other nodes
>> from the same DC, but to my understanding it already ended a few hours
>> ago...
>>
>>
>>
>> I see plenty of log messages such as [1], which ended a couple of hours ago, and
>> I've seen the new node streaming and accepting the data from the node that
>> performed the anticompaction. So far this was normal, so it seemed that the data
>> is in its right place. But now the new node seems sort of stuck. None of the
>> other nodes is anticompacting right now or has been anticompacting since
>> then.
>>
>>
>>
>>
>> The new node's CPU is close to zero and its iostats are almost zero, so I can't
>> find another bottleneck that would keep it hanging.
>> Someone on IRC suggested I retry joining this node,
>> e.g. decommission and rejoin it. I'll try it now...
>>
>>
>>
>>
>>
>>
>> [1] INFO [COMPACTION-POOL:1] 2011-01-04 04:04:09,721 CompactionManager.java 
>> (line 338) AntiCompacting 
>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
>>
>>
>>
>>
>>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:18,683 CompactionManager.java 
>> (line 338) AntiCompacting 
>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3874-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3873-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3876-Data.db')]
>>
>>
>>
>>
>>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:19,132 CompactionManager.java 
>> (line 338) AntiCompacting 
>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-951-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-976-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-978-Data.db')]
>>
>>
>>
>>
>>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:26,486 CompactionManager.java 
>> (line 338) AntiCompacting 
>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
>>
>>
>>
>>
>>
>> On Tue, Jan 4, 2011 at 12:45 PM, shimi <shim...@gmail.com> wrote:
>>
>>
>>
>>
>>
>> In my experience most of the time it takes for a node to join the cluster is
>> the anticompaction on the other nodes. The streaming part is very fast.
>> Check the other nodes' logs to see if there is any node doing
>> anticompaction. I don't remember how much data I had in the cluster when I
>> needed to add/remove nodes. I do remember that it took a few hours.
>>
>>
>>
>>
>>
>>
>> The node will join the ring only when it finishes the bootstrap.
>> --
>> /Ran
>>
>>
>
> --
> /Ran
>
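
For the advice quoted above about checking the other nodes' logs for anticompaction, a small script can report when the last AntiCompacting entry was written. This is a rough sketch, not a definitive tool: the pattern is modelled only on the log excerpts quoted in this thread, the function name last_anticompaction is hypothetical, and it assumes each log entry sits on a single line (the excerpts above are wrapped by the mail client).

```python
import re
from datetime import datetime

# Pattern modelled on the AntiCompacting log excerpts quoted in this thread.
ANTICOMPACT_RE = re.compile(
    r"INFO \[COMPACTION-POOL:\d+\] "
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} "
    r"CompactionManager\.java \(line \d+\) AntiCompacting"
)

def last_anticompaction(lines):
    """Return the timestamp of the most recent AntiCompacting entry, or None."""
    latest = None
    for line in lines:
        match = ANTICOMPACT_RE.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            if latest is None or stamp > latest:
                latest = stamp
    return latest

# Two entries taken from the excerpts above, joined back onto single lines.
sample = [
    " INFO [COMPACTION-POOL:1] 2011-01-04 04:04:09,721 CompactionManager.java "
    "(line 338) AntiCompacting [org.apache.cassandra.io.SSTableReader("
    "path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]",
    " INFO [COMPACTION-POOL:1] 2011-01-04 04:34:26,486 CompactionManager.java "
    "(line 338) AntiCompacting [org.apache.cassandra.io.SSTableReader("
    "path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]",
]
print(last_anticompaction(sample))
```

Running it over each existing node's log would show whether any node started an anticompaction recently or whether, as described above, the last one was hours ago.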
