Re: Nodes not added to existing cluster

Aaron Morton Thu, 26 Sep 2013 18:04:54 -0700

>  INFO 05:03:49,015 Cannot handshake version with /aa.bb.cc.dd
>  INFO 05:03:49,017 Handshaking version with /aa.bb.cc.dd
If you can turn up logging to TRACE for 
org.apache.cassandra.net.OutboundTcpConnection it will include the full error.
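
A minimal sketch of what that could look like, assuming a 2.0.x node with the stock log4j setup. The file location (conf/log4j-server.properties) is an assumption about your install; package installs may keep it elsewhere:

```properties
# Assumed file: conf/log4j-server.properties (Cassandra 2.0.x, log4j 1.2 syntax).
# Raises only the outbound connection logger to TRACE; all other loggers keep their level.
log4j.logger.org.apache.cassandra.net.OutboundTcpConnection=TRACE
```

Restarting the node is the safe way to make sure the change is picked up. TRACE is noisy, so remove the line again once you have captured the handshake error.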


> The two addresses that it is unable to handshake with are the other two 
> addresses of nodes in the cluster I'm unable to join.
Are you mixing versions?


Cheers

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 26/09/2013, at 5:13 PM, Skye Book <skye.b...@gmail.com> wrote:

> Hi Aaron, thanks for the clarification.
> 
> As might be expected, having the broadcast_address fixed hasn't fixed 
> anything.  What I did find after writing my last email is that output.log is 
> littered with these:
> 
>  INFO 05:03:49,015 Cannot handshake version with /aa.bb.cc.dd
>  INFO 05:03:49,017 Handshaking version with /aa.bb.cc.dd
>  INFO 05:03:49,803 Cannot handshake version with /ww.xx.yy.zz
>  INFO 05:03:49,805 Handshaking version with /ww.xx.yy.zz
> 
> The two addresses that it is unable to handshake with are the other two 
> addresses of nodes in the cluster I'm unable to join.  I started thinking 
> that maybe EC2 was having an unadvertised problem communicating between AZ's 
> but bringing up nodes in both of the other availability zones resulted in the 
> same wrong behavior.
> 
> I've gist'd my cassandra.yaml; it's pretty standard and hasn't caused an issue 
> in the past for me.  https://gist.github.com/skyebook/ec9364cdcec02e803ffc
> 
> Skye Book
> http://skyebook.net -- @sbook
> 
> On Sep 26, 2013, at 12:34 AM, Aaron Morton <aa...@thelastpickle.com> wrote:
> 
>>>  I am curious, though, how any of this worked in the first place spread 
>>> across three AZ's without that being set?
>> broadcast_address is only needed when you are going cross region (IIRC it's 
>> the EC2MultiRegionSnitch that sets it). 
>> 
>> As Rob said, make sure the seed list includes one of the other nodes and that 
>> the cluster_name is set. 
>> 
>> Cheers
>> 
>> -----------------
>> Aaron Morton
>> New Zealand
>> @aaronmorton
>> 
>> Co-Founder & Principal Consultant
>> Apache Cassandra Consulting
>> http://www.thelastpickle.com
>> 
>> On 26/09/2013, at 8:12 AM, Skye Book <skye.b...@gmail.com> wrote:
>> 
>>> Thank you, both Michael and Robert, for your suggestions.  I actually saw 
>>> 5760, but we were running on 2.0.0, in which it seems this was fixed.
>>> 
>>> That said, I noticed that my Chef scripts were failing to set the 
>>> broadcast_address correctly, which I'm guessing is the cause of the 
>>> problem; I'm fixing that and trying a redeploy.  I am curious, though, how any 
>>> of this worked in the first place spread across three AZ's without that 
>>> being set?
>>> 
>>> -Skye
>>> 
>>> On Sep 25, 2013, at 3:56 PM, Robert Coli <rc...@eventbrite.com> wrote:
>>> 
>>>> On Wed, Sep 25, 2013 at 12:41 PM, Skye Book <skye.b...@gmail.com> wrote:
>>>> I have a three node cluster using the EC2 Multi-Region Snitch currently 
>>>> operating only in US-EAST.  On having a node go down this morning, I 
>>>> started a new node with an identical configuration, except for the seed 
>>>> list, the listen address and the rpc address.  The new node comes up and 
>>>> creates its own cluster rather than joining the pre-existing ring.  I've 
>>>> tried creating a node both before and after using `nodetool remove` for the 
>>>> bad node, each time with the same result.
>>>> 
>>>> What version of Cassandra?
>>>> 
>>>> This particular confusing behavior is fixed upstream, in a version you 
>>>> should not deploy to production yet. Take some solace, however, that you 
>>>> may be the last Cassandra administrator to die for a broken code path!
>>>> 
>>>> https://issues.apache.org/jira/browse/CASSANDRA-5768
>>>> 
>>>> Does anyone have any suggestions for where to look that might put me on 
>>>> the right track?
>>>> 
>>>> It must be that your seed list is wrong in some way, or your node state is 
>>>> wrong. If you're trying to bootstrap a node, note that you can't bootstrap 
>>>> a node when it is in its own seed list.
>>>> 
>>>> If you have installed Cassandra via debian package, there is a possibility 
>>>> that your node has started before you explicitly started it. If so, it 
>>>> might have invalid node state.
>>>> 
>>>> Have you tried wiping the data directory and trying again?
>>>> 
>>>> What is your seed list? Are you sure the new node can reach the seeds on 
>>>> the network layer?
>>>> 
>>>> =Rob
>>> 
>> 
>
