Hi,

Did you solve this problem? I'm having the same problem. I'm trying to bootstrap a third node in a 0.6.6 cluster. It has two keyspaces, Keyspace1 and KeyspaceLogs, both with replication factor 2.
It starts bootstrapping and receives some streams, but then it keeps waiting for more streams. I enabled debug mode. These lines may be useful:

DEBUG [main] 2010-11-07 17:39:50,052 BootStrapper.java (line 70) Beginning bootstrap process
DEBUG [main] 2010-11-07 17:39:50,082 StorageService.java (line 160) Added /10.204.93.16/Keyspace1 as a bootstrap source
...
DEBUG [main] 2010-11-07 17:39:50,090 StorageService.java (line 160) Added /10.204.93.16/KeyspaceLogs as a bootstrap source
... (streaming messages)
DEBUG [Thread-56] 2010-11-07 17:45:51,706 StorageService.java (line 171) Removed /10.204.93.16/Keyspace1 as a bootstrap source; remaining is [/10.204.93.16]
... (and never ends).

It seems it is waiting for [/10.204.93.16] when it should be waiting for /10.204.93.16/KeyspaceLogs.

The third node is 64-bit, while the two existing nodes are 32-bit. Can this be a problem?

Thank you.

2010/10/28 Dimitry Lvovsky <dimi...@reviewpro.com>

> Maybe your <StoragePort>7000</StoragePort> is being blocked by iptables
> or some firewall, or maybe you have it bound (<ListenAddress> tag) to
> localhost instead of an IP address.
>
> Hope this helps,
> Dimitry.
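The firewall explanation suggested above can be ruled in or out with a plain TCP probe of the storage port, run from the new node. This is only a minimal sketch: `can_connect` is a hypothetical helper name, and it assumes Python is available on that machine. A successful connect shows the port is reachable, not that streaming itself works.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and applies the timeout
        # to the connect attempt; the socket is closed on leaving the block.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, `can_connect("10.204.93.16", 7000)` uses the existing node's address from the debug log and the StoragePort value quoted above; substitute your own values.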
>
> On Thu, Oct 28, 2010 at 5:35 PM, Thibaut Britz
> <thibaut.br...@trendiction.com> wrote:
>
>> Hi,
>>
>> I have the same problem with 0.6.5.
>>
>> New nodes will hang forever in bootstrap mode (no streams are being
>> opened) and the receiver thread just waits for data forever:
>>
>> INFO [Thread-53] 2010-10-27 20:33:37,399 SSTableReader.java (line 120) Sampling index for /hd2/cassandra/data/table_xyz/table_xyz-3-Data.db
>> INFO [Thread-53] 2010-10-27 20:33:37,444 StreamCompletionHandler.java (line 64) Streaming added /hd2/cassandra/data/table_xyz/table_xyz-3-Data.db
>>
>> Stacktrace:
>>
>> "pool-1-thread-53" prio=10 tid=0x00000000412f2800 nid=0x215c runnable [0x00007fd7cf217000]
>>    java.lang.Thread.State: RUNNABLE
>>     at java.net.SocketInputStream.socketRead0(Native Method)
>>     at java.net.SocketInputStream.read(SocketInputStream.java:129)
>>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
>>     at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>>     - locked <0x00007fd7e77e0520> (a java.io.BufferedInputStream)
>>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:126)
>>     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>>     at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:314)
>>     at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:262)
>>     at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:192)
>>     at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:1154)
>>     at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>     at java.lang.Thread.run(Thread.java:662)
>>
>> On Thu, Oct 28, 2010 at 12:44 PM, aaron morton
>> <aa...@thelastpickle.com> wrote:
>>
>>> The best approach is to manually select the tokens, see the Load
>>> Balancing section http://wiki.apache.org/cassandra/Operations
>>>
>>> Also, are there any log messages in the existing nodes or the new one
>>> which mention each other?
>>>
>>> Is this a production system? Is it still running?
>>>
>>> Sorry there is not a lot to go on, it sounds like you've done the right
>>> thing. I'm assuming things like the Cluster Name, seed list and port
>>> numbers are set correctly, as the new node got some data.
>>>
>>> You'll need to dig through the logs a bit more to see that the
>>> bootstrapping started and what was the last message it logged.
>>>
>>> Good Luck.
>>> Aaron
>>>
>>> On 27 Oct 2010, at 22:40, Dimitry Lvovsky wrote:
>>>
>>> Hi Aaron,
>>> Thanks for your reply.
>>>
>>> We still haven't solved this, unfortunately.
>>>
>>> How did you start the bootstrap for the .18 node ?
>>>
>>> Standard way: we set "AutoBootstrap" to true and added all the servers
>>> from the working ring as seeds.
>>>
>>>> Was it the .18 or the .17 node you tried to add
>>>
>>> We first tried adding .17. It streamed for a while, took on 50GB of
>>> load, and stopped streaming, but then didn't enter the ring. We left it
>>> for a few days to see if it would come in, but no luck. After that we did
>>> decommission and removeToken (in that order) operations.
>>> Since we couldn't get .17 in, we tried again with .18. Before doing so we
>>> increased the RpcTimeoutInMillis from 1000 to 10000, having read that this
>>> may cause the problem of nodes not entering the ring. It's been going
>>> since Friday and still, like .17, won't come into the ring.
>>>
>>> Does it have a token in the config or did you use nodetool move to set it
>>>
>>> No, we didn't manually set the token in the config; rather, we were
>>> relying on the token to be assigned during bootstrap from the
>>> RandomPartitioner.
>>>
>>> Again thanks for the help.
>>>
>>> Dimitry.
>>>
>>> On Tue, Oct 26, 2010 at 10:14 PM, Aaron Morton
>>> <aa...@thelastpickle.com> wrote:
>>>
>>>> Dimitry, Did you get anywhere with this ?
>>>>
>>>> Was it the .18 or the .17 node you tried to add ? How did you start the
>>>> bootstrap for the .18 node ? Does it have a token in the config or did
>>>> you use nodetool move to set it?
>>>>
>>>> I had a quick look at the code. AFAIK the message about removing the fat
>>>> client is logged when the node does not have a record of the token the
>>>> other node has.
>>>>
>>>> Aaron
>>>>
>>>> On 26 Oct, 2010, at 10:42 PM, Dimitry Lvovsky <dimi...@reviewpro.com>
>>>> wrote:
>>>>
>>>> Hi All,
>>>> We recently upgraded from .65 to .66, after which we tried adding a new
>>>> node to our cluster. We left it bootstrapping and after 3 days it still
>>>> refused to join the ring. The strange thing is that nodetool info shows
>>>> 50GB of load and nodetool ring shows that it sees the rest of the ring,
>>>> which it is not part of. We tried the process again with another
>>>> server -- again the same thing as before:
>>>>
>>>> //from machine 192.168.218
>>>>
>>>> /opt/cassandra/bin/nodetool -h localhost -p 8999 info
>>>> 131373516047318302934572185119435768941
>>>> Load             : 52.85 GB
>>>> Generation No    : 1287761987
>>>> Uptime (seconds) : 323157
>>>> Heap Memory (MB) : 795.42 / 1945.63
>>>>
>>>> /opt/cassandra/bin/nodetool -h localhost -p 8999 ring
>>>> Address       Status  Load      Range                                     Ring
>>>>                                 158573510920250391466717289405976537674
>>>> 192.168.2.22  Up      59.45 GB  28203205416427384773583427414698832202    |<--|
>>>> 192.168.2.23  Up      44.95 GB  60562227403709245514637766500430120055    |   |
>>>> 192.168.2.20  Up      47.15 GB  104160057322065544623939416372654814065   |   |
>>>> 192.168.2.21  Up      61.04 GB  158573510920250391466717289405976537674   |-->|
>>>>
>>>> /opt/cassandra/bin/nodetool -h localhost -p 8999 streams
>>>> Mode: Bootstrapping
>>>> Not sending any streams.
>>>> Not receiving any streams.
>>>>
>>>> What's more, while looking at the log of one of the nodes I see gossip
>>>> messages from 192.168.1.17 -- the first node we tried to add to the
>>>> cluster, but which is not running at the time of the log message:
>>>>
>>>> INFO [Timer-0] 2010-10-26 02:13:20,340 Gossiper.java (line 406) FatClient /192.168.2.17 has been silent for 3600000ms, removing from gossip
>>>> INFO [GMFD:1] 2010-10-26 02:13:51,398 Gossiper.java (line 591) Node /192.168.2.17 is now part of the cluster
>>>>
>>>> Thanks in advance for the help,
>>>> Dimitry
>>>
>>> --
>>> Dimitry Lvovsky
>>> Director of Engineering
>>> ReviewPro
>>> www.reviewpro.com
>>> +34 616 337 103
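For the manual token selection Aaron recommends earlier in the thread, evenly spaced RandomPartitioner tokens are usually computed as i * 2**127 / N for a ring of N nodes. The sketch below is a minimal illustration under that assumption; `balanced_tokens` is a hypothetical helper, not part of nodetool or Cassandra.

```python
def balanced_tokens(node_count):
    """Return evenly spaced tokens for a ring of node_count nodes."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # RandomPartitioner tokens fall in the range 0 .. 2**127.
    ring_size = 2 ** 127
    # Integer division keeps the tokens exact; Python ints do not overflow.
    return [i * ring_size // node_count for i in range(node_count)]
```

Calling `balanced_tokens(5)` gives one candidate token per node for the four-node ring above plus the node being added. A chosen value would then be set as the token in the new node's config or applied with nodetool move, the two options mentioned in the thread.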