My nodes all have themselves in their list of seeds - always did - and everything works. (You may ask why I did this. I don't know, I must have copied it from an example somewhere.)
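If you want to check quickly whether a node lists itself as a seed, here is a minimal sketch. It assumes the 0.6-style storage-conf.xml layout, with `<Seed>` elements nested under `<Seeds>`. The function names, the sample addresses, and the sample XML are made up for illustration; they are not taken from anyone's real configuration.

```python
import xml.etree.ElementTree as ET


def seeds_in_config(xml_text):
    """Return the seed addresses found in <Seed> elements of the config text."""
    root = ET.fromstring(xml_text)
    return [s.text.strip() for s in root.iter("Seed") if s.text and s.text.strip()]


def lists_itself_as_seed(xml_text, own_address):
    """True if own_address appears in the node's own seed list."""
    return own_address in seeds_in_config(xml_text)


# Hypothetical config fragment; the addresses are placeholders.
SAMPLE = """<Storage>
  <AutoBootstrap>true</AutoBootstrap>
  <Seeds>
    <Seed>10.0.0.1</Seed>
    <Seed>10.0.0.2</Seed>
  </Seeds>
</Storage>"""

if __name__ == "__main__":
    print(seeds_in_config(SAMPLE))
    print(lists_itself_as_seed(SAMPLE, "10.0.0.2"))
```

To use it on a real node, read the node's own storage-conf.xml into a string and pass the address that node listens on.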
On Wed, Jan 5, 2011 at 9:42 AM, Ran Tavory <ran...@gmail.com> wrote:
> I was able to make the node join the ring but I'm confused.
> What I did is, first when adding the node, this node was not in the seeds
> list of itself. AFAIK this is how it's supposed to be. So it was able to
> transfer all data to itself from other nodes but then it stayed in the
> bootstrapping state.
> So what I did (and I don't know why it works), is add this node to the
> seeds list in its own storage-conf.xml file. Then restart the server and
> then I finally see it in the ring...
> If I had added the node to the seeds list of itself when first joining it,
> it would not join the ring but if I do it in two phases it did work.
> So it's either my misunderstanding or a bug...
>
> On Wed, Jan 5, 2011 at 7:14 AM, Ran Tavory <ran...@gmail.com> wrote:
>
>> The new node does not see itself as part of the ring, it sees all others
>> but itself, so from that perspective the view is consistent.
>> The only problem is that the node never finishes bootstrapping. It stays in
>> this state for hours (It's been 20 hours now...)
>>
>> $ bin/nodetool -p 9004 -h localhost streams
>>> Mode: Bootstrapping
>>> Not sending any streams.
>>> Not receiving any streams.
>>
>> On Wed, Jan 5, 2011 at 1:20 AM, Nate McCall <n...@riptano.com> wrote:
>>
>>> Does the new node have itself in the list of seeds per chance? This
>>> could cause some issues if so.
>>>
>>> On Tue, Jan 4, 2011 at 4:10 PM, Ran Tavory <ran...@gmail.com> wrote:
>>> > I'm still at a loss. I haven't been able to resolve this. I tried
>>> > adding another node at a different location on the ring but this node
>>> > too remains stuck in the bootstrapping state for many hours without
>>> > any of the other nodes being busy with anti compaction or anything
>>> > else. I don't know what's keeping it from finishing the bootstrap: no
>>> > CPU, no io, files were already streamed so what is it waiting for?
>>> > I read the release notes of 0.6.7 and 0.6.8 and there didn't seem to
>>> > be anything addressing a similar issue so I figured there was no point
>>> > in upgrading. But let me know if you think there is.
>>> > Or any other advice...
>>> >
>>> > On Tuesday, January 4, 2011, Ran Tavory <ran...@gmail.com> wrote:
>>> >> Thanks Jake, but unfortunately the streams directory is empty so I
>>> >> don't think that any of the nodes is anti-compacting data right now or
>>> >> had been in the past 5 hours. It seems that all the data was already
>>> >> transferred to the joining host but the joining node, after having
>>> >> received the data would still remain in bootstrapping mode and not join
>>> >> the cluster. I'm not sure that *all* data was transferred (perhaps other
>>> >> nodes need to transfer more data) but nothing is actually happening so I
>>> >> assume all has been moved.
>>> >> Perhaps it's a configuration error on my part. Should I use
>>> >> AutoBootstrap=true? Anything else I should look out for in the
>>> >> configuration file or something else?
>>> >>
>>> >> On Tue, Jan 4, 2011 at 4:08 PM, Jake Luciani <jak...@gmail.com> wrote:
>>> >>
>>> >> In 0.6, locate the node doing anti-compaction and look in the
>>> >> "streams" subdirectory in the keyspace data dir to monitor the
>>> >> anti-compaction progress (it puts new SSTables for bootstrapping node in
>>> >> there)
>>> >>
>>> >> On Tue, Jan 4, 2011 at 8:01 AM, Ran Tavory <ran...@gmail.com> wrote:
>>> >>
>>> >> Running nodetool decommission didn't help. Actually the node refused
>>> >> to decommission itself (b/c it wasn't part of the ring). So I simply
>>> >> stopped the process, deleted all the data directories and started it
>>> >> again. It worked in the sense of the node bootstrapped again but as
>>> >> before, after it had finished moving the data nothing happened for a
>>> >> long time (I'm still waiting, but nothing seems to be happening).
>>> >>
>>> >> Any hints how to analyze a "stuck" bootstrapping node? Thanks
>>> >> On Tue, Jan 4, 2011 at 1:51 PM, Ran Tavory <ran...@gmail.com> wrote:
>>> >> Thanks Shimi, so indeed anticompaction was run on one of the other
>>> >> nodes from the same DC but to my understanding it has already ended. A
>>> >> few hours ago...
>>> >>
>>> >> I see plenty of log messages such as [1] which ended a couple of hours
>>> >> ago, and I've seen the new node streaming and accepting the data from
>>> >> the node which performed the anticompaction and so far it was normal so
>>> >> it seemed that data is at its right place. But now the new node seems
>>> >> sort of stuck. None of the other nodes is anticompacting right now or
>>> >> had been anticompacting since then.
>>> >>
>>> >> The new node's CPU is close to zero, its iostats are almost zero so I
>>> >> can't find another bottleneck that would keep it hanging.
>>> >> On the IRC someone suggested I'd maybe retry to join this node,
>>> >> e.g. decommission and rejoin it again. I'll try it now...
>>> >>
>>> >> [1] INFO [COMPACTION-POOL:1] 2011-01-04 04:04:09,721 CompactionManager.java (line 338) AntiCompacting [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
>>> >>
>>> >> INFO [COMPACTION-POOL:1] 2011-01-04 04:34:18,683 CompactionManager.java (line 338) AntiCompacting [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3874-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3873-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3876-Data.db')]
>>> >>
>>> >> INFO [COMPACTION-POOL:1] 2011-01-04 04:34:19,132 CompactionManager.java (line 338) AntiCompacting [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-951-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-976-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-978-Data.db')]
>>> >>
>>> >> INFO [COMPACTION-POOL:1] 2011-01-04 04:34:26,486 CompactionManager.java (line 338) AntiCompacting [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
>>> >>
>>> >> On Tue, Jan 4, 2011 at 12:45 PM, shimi <shim...@gmail.com> wrote:
>>> >>
>>> >> In my experience most of the time it takes for a node to join the
>>> >> cluster is the anticompaction on the other nodes. The streaming part is
>>> >> very fast.
>>> >> Check the other nodes' logs to see if there is any node doing
>>> >> anticompaction. I don't remember how much data I had in the cluster when
>>> >> I needed to add/remove nodes. I do remember that it took a few hours.
>>> >>
>>> >> The node will join the ring only when it will finish the bootstrap.
>>> >> --
>>> >> /Ran
>>> >>
>>> >
>>> > --
>>> > /Ran
>>> >
>>>
>>
>> --
>> /Ran
>>
>
> --
> /Ran
>
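
For the "check the other nodes' logs" step, a minimal sketch that finds the most recent AntiCompacting entry in a log. It assumes the log lines look like the ones quoted in [1]. The regular expression and the function name are illustrative and not part of Cassandra; the sample lines are the ones from this thread, with the SSTable lists shortened to "[...]".

```python
import re
from datetime import datetime

# Matches lines shaped like the AntiCompacting entries quoted in this thread.
LINE_RE = re.compile(
    r"INFO \[COMPACTION-POOL:\d+\] "
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} "
    r"CompactionManager\.java \(line \d+\) AntiCompacting"
)


def last_anticompaction(log_lines):
    """Return the timestamp of the latest AntiCompacting entry, or None."""
    latest = None
    for line in log_lines:
        match = LINE_RE.search(line)
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            if latest is None or ts > latest:
                latest = ts
    return latest


# Timestamps copied from the thread; SSTable lists shortened to "[...]".
SAMPLE_LINES = [
    "INFO [COMPACTION-POOL:1] 2011-01-04 04:04:09,721 CompactionManager.java (line 338) AntiCompacting [...]",
    "INFO [COMPACTION-POOL:1] 2011-01-04 04:34:18,683 CompactionManager.java (line 338) AntiCompacting [...]",
    "INFO [COMPACTION-POOL:1] 2011-01-04 04:34:26,486 CompactionManager.java (line 338) AntiCompacting [...]",
]

if __name__ == "__main__":
    print(last_anticompaction(SAMPLE_LINES))
```

If the latest timestamp on every node is hours old, as it was here, anticompaction is probably not what the bootstrapping node is waiting for.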