Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
I see. Thanks for clarifying, Jonathan. On Wednesday, January 5, 2011, Jonathan Ellis wrote: > 1676 says "Avoid dropping messages off the client request path." > Bootstrap messages are "off the client request path." So, if some of > the nodes involved were loaded enough that they were dropping mess

Re: Bootstrapping taking long

2011-01-05 Thread Jonathan Ellis
1676 says "Avoid dropping messages off the client request path." Bootstrap messages are "off the client request path." So, if some of the nodes involved were loaded enough that they were dropping messages older than RPC_TIMEOUT to cope, the cluster could lose part of the bootstrap communication permanently.
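To make the point above concrete, here is a minimal Python sketch of the behaviour being described. It is not Cassandra's code: the `Message` class, the `droppable` flag, the `drain` function, the verb names, and the timeout value are all hypothetical, chosen only to illustrate the idea that expired messages are shed under load only when they sit on the client request path.

```python
from dataclasses import dataclass

# Assumed value for illustration only; the real RPC timeout is configured per cluster.
RPC_TIMEOUT_SECONDS = 10.0


@dataclass
class Message:
    verb: str          # hypothetical label for the kind of message
    created_at: float  # time the message was enqueued, in seconds
    droppable: bool    # True only for messages on the client request path


def drain(queue, now, timeout=RPC_TIMEOUT_SECONDS):
    """Return the messages that should still be processed.

    An expired message is discarded only if it is droppable. Messages that
    are off the client request path (for example bootstrap traffic) are
    kept even when they are older than the timeout, because nobody will
    retry them.
    """
    kept = []
    for msg in queue:
        expired = (now - msg.created_at) > timeout
        if expired and msg.droppable:
            continue  # the client has already timed out and can retry
        kept.append(msg)
    return kept
```

Without the `droppable` check, an overloaded node would discard the old bootstrap message together with the old read, which matches the failure described above: the joining node waits indefinitely for a message that will never arrive.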

Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
OK, thanks. So I see we had the same problem (I too had multiple keyspaces, not that I know why that matters to the problem at hand), and I see that by upgrading to 0.6.7 you solved your problem (I didn't try it; I had a different workaround), but frankly, I don't understand how https://issues.apache.org/

Re: Bootstrapping taking long

2011-01-05 Thread Thibaut Britz
Had the same problem a while ago. Upgrading solved the problem (I don't know if you have to redeploy your cluster, though): http://www.mail-archive.com/user@cassandra.apache.org/msg07106.html On Wed, Jan 5, 2011 at 4:29 PM, Ran Tavory wrote: > @Thibaut wrong email? Or how's "Avoid dropping message

Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
@Thibaut wrong email? Or how's "Avoid dropping messages off the client request path" (CASSANDRA-1676) related to the bootstrap questions I had? On Wed, Jan 5, 2011 at 5:23 PM, Thibaut Britz wrote: > https://issues.apache.org/jira/browse/CASSANDRA-1676 > > you have to use at least 0.6.7 > > > > O

Re: Bootstrapping taking long

2011-01-05 Thread Thibaut Britz
https://issues.apache.org/jira/browse/CASSANDRA-1676 you have to use at least 0.6.7 On Wed, Jan 5, 2011 at 4:19 PM, Edward Capriolo wrote: > On Wed, Jan 5, 2011 at 10:05 AM, Ran Tavory wrote: > > In storage-conf I see this comment [1] from which I understand that the > > recommended way to boo

Re: Bootstrapping taking long

2011-01-05 Thread Edward Capriolo
On Wed, Jan 5, 2011 at 10:05 AM, Ran Tavory wrote: > In storage-conf I see this comment [1] from which I understand that the > recommended way to bootstrap a new node is to set AutoBootstrap=true and > remove itself from the seeds list. > Moreover, I did try to set AutoBootstrap=true and have the

Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
In storage-conf I see this comment [1] from which I understand that the recommended way to bootstrap a new node is to set AutoBootstrap=true and remove itself from the seeds list. Moreover, I did try to set AutoBootstrap=true and have the node in its own seeds list, but it would not bootstrap. I do

Re: Bootstrapping taking long

2011-01-05 Thread David Boxenhorn
If "seed list should be the same across the cluster" that means that nodes *should* have themselves as a seed. If that doesn't work for Ran, then that is the first problem, no? On Wed, Jan 5, 2011 at 3:56 PM, Jake Luciani wrote: > Well your ring issues don't make sense to me, seed list should b

Re: Bootstrapping taking long

2011-01-05 Thread Jake Luciani
Well, your ring issues don't make sense to me; the seed list should be the same across the cluster. I'm just thinking of other things to try: non-bootstrapped nodes should join the ring instantly, but reads will fail if you aren't using quorum. On Wed, Jan 5, 2011 at 8:51 AM, Ran Tavory wrote: > I hav

Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
I haven't tried repair. Should I? On Jan 5, 2011 3:48 PM, "Jake Luciani" wrote: > Have you tried not bootstrapping but setting the token and manually calling > repair? > > On Wed, Jan 5, 2011 at 7:07 AM, Ran Tavory wrote: > >> My conclusion is lame: I tried this on several hosts and saw the same

Re: Bootstrapping taking long

2011-01-05 Thread Jake Luciani
Have you tried not bootstrapping but setting the token and manually calling repair? On Wed, Jan 5, 2011 at 7:07 AM, Ran Tavory wrote: > My conclusion is lame: I tried this on several hosts and saw the same > behavior, the only way I was able to join new nodes was to first start them > when they

Re: Bootstrapping taking long

2011-01-05 Thread David Boxenhorn
I started all my nodes the first time with seeds in their own lists, and it worked. I think I started them in 0.6.1, but I'm not sure. (I'm now using 0.6.8). On Wed, Jan 5, 2011 at 2:07 PM, Ran Tavory wrote: > My conclusion is lame: I tried this on several hosts and saw the same > behavior, the

Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
My conclusion is lame: I tried this on several hosts and saw the same behavior. The only way I was able to join new nodes was to first start them when they are *not in* their own seeds list and, after they finish transferring the data, restart them with themselves *in* their own seeds list. Aft

Re: Bootstrapping taking long

2011-01-05 Thread David Boxenhorn
My nodes all have themselves in their list of seeds - always did - and everything works. (You may ask why I did this. I don't know, I must have copied it from an example somewhere.) On Wed, Jan 5, 2011 at 9:42 AM, Ran Tavory wrote: > I was able to make the node join the ring but I'm confused. >

Re: Bootstrapping taking long

2011-01-04 Thread Ran Tavory
I was able to make the node join the ring, but I'm confused. When I first added the node, it was not in its own seeds list. AFAIK this is how it's supposed to be. So it was able to transfer all data to itself from other nodes, but then it stayed in the bootstrapping state.

Re: Bootstrapping taking long

2011-01-04 Thread Ran Tavory
The new node does not see itself as part of the ring; it sees all others but itself, so from that perspective the view is consistent. The only problem is that the node never finishes bootstrapping. It stays in this state for hours (It's been 20 hours now...) $ bin/nodetool -p 9004 -h localhost stre

Re: Bootstrapping taking long

2011-01-04 Thread Nate McCall
Does the new node have itself in the list of seeds, perchance? This could cause some issues if so. On Tue, Jan 4, 2011 at 4:10 PM, Ran Tavory wrote: > I'm still at a loss. I haven't been able to resolve this. I tried > adding another node at a different location on the ring but this node > too re

Re: Bootstrapping taking long

2011-01-04 Thread Ran Tavory
I'm still at a loss. I haven't been able to resolve this. I tried adding another node at a different location on the ring, but this node too remains stuck in the bootstrapping state for many hours without any of the other nodes being busy with anti compaction or anything else. I don't know what's ke

Re: Bootstrapping taking long

2011-01-04 Thread Ran Tavory
Thanks Jake, but unfortunately the streams directory is empty, so I don't think that any of the nodes is anti-compacting data right now or has been in the past 5 hours. It seems that all the data was already transferred to the joining host but the joining node, after having received the data would s

Re: Bootstrapping taking long

2011-01-04 Thread shimi
You will have something new to talk about in your talk tomorrow :) You said that the anti compaction was only on a single node? I think that your new node should get data from at least two other nodes (depending on the replication factor). Maybe the problem is not in the new node. In old version (

Re: Bootstrapping taking long

2011-01-04 Thread Jake Luciani
In 0.6, locate the node doing anti-compaction and look in the "streams" subdirectory in the keyspace data dir to monitor the anti-compaction progress (it puts the new SSTables for the bootstrapping node in there). On Tue, Jan 4, 2011 at 8:01 AM, Ran Tavory wrote: > Running nodetool decommission didn't he
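One way to follow that suggestion is to poll the directory and watch its file count and total size change over time. The Python sketch below does only that. The function name `streams_progress` and the example path are hypothetical; substitute the "streams" subdirectory of your own keyspace data dir.

```python
import os


def streams_progress(streams_dir):
    """Return (file_count, total_bytes) for all files under streams_dir.

    A missing or empty directory yields (0, 0), which would suggest that
    no anti-compaction output is currently waiting to be streamed.
    """
    count = 0
    total = 0
    for root, _dirs, files in os.walk(streams_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                total += os.path.getsize(path)
            except OSError:
                continue  # the file was removed between listing and stat
            count += 1
    return count, total


# Example usage; the path is a placeholder, not a real default:
# print(streams_progress("/path/to/data/MyKeyspace/streams"))
```

Calling it every minute or so on the node doing the anti-compaction shows whether the numbers are still changing; a result that stays at (0, 0) is consistent with the "streams directory is empty" observation reported elsewhere in this thread.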

Re: Bootstrapping taking long

2011-01-04 Thread Ran Tavory
Running nodetool decommission didn't help. Actually the node refused to decommission itself (b/c it wasn't part of the ring). So I simply stopped the process, deleted all the data directories and started it again. It worked in the sense that the node bootstrapped again, but as before, after it had fin

Re: Bootstrapping taking long

2011-01-04 Thread Ran Tavory
Thanks Shimi, so indeed anticompaction was run on one of the other nodes from the same DC, but to my understanding it has already ended. A few hours ago... I saw plenty of log messages such as [1] which ended a couple of hours ago, and I've seen the new node streaming and accepting the data from the node

Re: Bootstrapping taking long

2011-01-04 Thread shimi
In my experience most of the time it takes for a node to join the cluster is the anticompaction on the other nodes. The streaming part is very fast. Check the other nodes' logs to see if there is any node doing anticompaction. I don't remember how much data I had in the cluster when I needed to add/

Bootstrapping taking long

2011-01-04 Thread Ran Tavory
I asked the same question on IRC but no luck there, everyone's asleep ;)... Using 0.6.6, I'm adding a new node to the cluster. It starts out fine but then gets stuck in the bootstrapping state for too long. More than an hour and still counting. $ bin/nodetool -p 9004 -h localhost streams > Mod