There is a bug where a node without a schema cannot bootstrap. Do you have
a schema?


On Tue, Feb 18, 2014 at 1:29 PM, Arindam Barua <aba...@247-inc.com> wrote:

>
>
> The node is still out of the ring. Any suggestions on how to get it in
> will be very helpful.
>
>
>
> *From:* Arindam Barua [mailto:aba...@247-inc.com]
> *Sent:* Friday, February 14, 2014 1:04 AM
> *To:* user@cassandra.apache.org
> *Subject:* Bootstrap stuck: vnode enabled 1.2.12
>
>
>
>
>
> After our otherwise successful upgrade procedure to enable vnodes, when
> adding back "new" hosts to our cluster, one non-seed host ran into a
> hardware issue during bootstrap. By the time the hardware issue was fixed a
> week later, all other nodes were added successfully, cleaned, repaired. The
> disks on this node were untouched, and when the node was started back up,
> it detected an interrupted bootstrap, and attempted to bootstrap. However,
> after ~24 hrs it was still stuck in the 'JOINING' state according to
> nodetool netstats on that node, even though no streams were flowing to/from
> it. Also, it did not appear in nodetool status in any way/form (not even as
> JOINING).
>
>
>
> From a couple of observed thread dumps, the stack of the thread blocked
> during bootstrap is at [1].
>
>
>
> Since the node wasn't making any progress, I ended up stopping Cassandra,
> cleaning up the data and commitlog directories, and attempted a fresh
> bootstrap. Nodetool netstats immediately reported a whole bunch of streams
> queued up, and data started streaming to the node. The data directory
> quickly grew to 18 GB (the other nodes had ~25 GB, but we have a lot of data
> with low TTLs). However, the node ended up being in the earlier reported
> state, i.e. nodetool netstats doesn't have anything queued, but still
> reports the JOINING state, even though it's been > 24 hrs. There are no
> other ERRORS in the logs, and new data being written to the cluster makes
> it to this node just fine, triggering compactions, etc. from time to time.
>
>
>
> Any help is appreciated.
>
>
>
> Thanks,
>
> Arindam
>
> [1] Thread dump
> Thread 3708: (state = BLOCKED)
>  - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
>  - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=156 (Interpreted frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() @bci=1, line=811 (Interpreted frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(int) @bci=55, line=969 (Interpreted frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(int) @bci=24, line=1281 (Interpreted frame)
>  - java.util.concurrent.CountDownLatch.await() @bci=5, line=207 (Interpreted frame)
>  - org.apache.cassandra.dht.RangeStreamer.fetch() @bci=209, line=256 (Interpreted frame)
>  - org.apache.cassandra.dht.BootStrapper.bootstrap() @bci=120, line=84 (Interpreted frame)
>  - org.apache.cassandra.service.StorageService.bootstrap(java.util.Collection) @bci=172, line=978 (Interpreted frame)
>  - org.apache.cassandra.service.StorageService.joinTokenRing(int) @bci=827, line=744 (Interpreted frame)
>  - org.apache.cassandra.service.StorageService.initServer(int) @bci=363, line=585 (Interpreted frame)
>  - org.apache.cassandra.service.StorageService.initServer() @bci=4, line=482 (Interpreted frame)
>  - org.apache.cassandra.service.CassandraDaemon.setup() @bci=1069, line=348 (Interpreted frame)
>  - org.apache.cassandra.service.CassandraDaemon.activate() @bci=59, line=447 (Interpreted frame)
>  - org.apache.cassandra.service.CassandraDaemon.main(java.lang.String[]) @bci=3, line=490 (Interpreted frame)
>
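
The dump above shows the bootstrap thread parked in CountDownLatch.await() beneath RangeStreamer.fetch(). As a rough illustration of why such a thread can sit idle with no visible activity, here is a minimal sketch of the JDK latch behaviour only. It is not Cassandra's code: the class and method names are hypothetical, and it assumes (nothing in this thread confirms it) that the latch is counted down once per finished streaming session.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchStallDemo {
    // Hypothetical stand-in for a caller that waits on `expected` sessions,
    // of which only `completed` ever report back.
    static boolean awaitSessions(int expected, int completed, long timeoutMillis)
            throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(expected);
        for (int i = 0; i < completed; i++) {
            latch.countDown(); // one session reports completion
        }
        // An untimed latch.await(), as in the dump, would block here for as
        // long as the count stays above zero. The timed variant is used so
        // the demo terminates: it returns false if the timeout elapses first.
        return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("all sessions reported: " + awaitSessions(2, 2, 200));
        System.out.println("one session missing:   " + awaitSessions(2, 1, 200));
    }
}
```

If that assumption holds, a single session that never reports back would be enough to keep the caller waiting even though no streams are listed as active.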
