> INFO 05:03:49,015 Cannot handshake version with /aa.bb.cc.dd
> INFO 05:03:49,017 Handshaking version with /aa.bb.cc.dd

If you can turn up logging to TRACE for
org.apache.cassandra.net.OutboundTcpConnection, it will include the full error.

> The two addresses that it is unable to handshake with are the other two
> addresses of nodes in the cluster I'm unable to join.

Are you mixing versions?

Cheers

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 26/09/2013, at 5:13 PM, Skye Book <skye.b...@gmail.com> wrote:

> Hi Aaron, thanks for the clarification.
>
> As might be expected, having the broadcast_address fixed hasn't fixed
> anything. What I did find after writing my last email is that output.log is
> littered with these:
>
> INFO 05:03:49,015 Cannot handshake version with /aa.bb.cc.dd
> INFO 05:03:49,017 Handshaking version with /aa.bb.cc.dd
> INFO 05:03:49,803 Cannot handshake version with /ww.xx.yy.zz
> INFO 05:03:49,805 Handshaking version with /ww.xx.yy.zz
>
> The two addresses that it is unable to handshake with are the other two
> addresses of nodes in the cluster I'm unable to join. I started thinking
> that maybe EC2 was having an unadvertised problem communicating between AZ's,
> but bringing up nodes in both of the other availability zones resulted in the
> same wrong behavior.
>
> I've gist'd my cassandra.yaml; it's pretty standard and hasn't caused an issue
> in the past for me. https://gist.github.com/skyebook/ec9364cdcec02e803ffc
>
> Skye Book
> http://skyebook.net -- @sbook
>
> On Sep 26, 2013, at 12:34 AM, Aaron Morton <aa...@thelastpickle.com> wrote:
>
>>> I am curious, though, how any of this worked in the first place spread
>>> across three AZ's without that being set?
>> broadcast_address is only needed when you are going cross region (IIRC it's
>> the EC2MultiRegionSnitch that sets it).
>>
>> As Rob said, make sure the seed list includes one of the other nodes and that
>> the cluster_name is set.
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> New Zealand
>> @aaronmorton
>>
>> Co-Founder & Principal Consultant
>> Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>> On 26/09/2013, at 8:12 AM, Skye Book <skye.b...@gmail.com> wrote:
>>
>>> Thank you, both Michael and Robert, for your suggestions. I actually saw
>>> 5760, but we were running on 2.0.0, which it seems like this was fixed in.
>>>
>>> That said, I noticed that my Chef scripts were failing to set the
>>> broadcast_address correctly, which I'm guessing is the cause of the
>>> problem. I'm fixing that and trying a redeploy. I am curious, though, how any
>>> of this worked in the first place spread across three AZ's without that
>>> being set?
>>>
>>> -Skye
>>>
>>> On Sep 25, 2013, at 3:56 PM, Robert Coli <rc...@eventbrite.com> wrote:
>>>
>>>> On Wed, Sep 25, 2013 at 12:41 PM, Skye Book <skye.b...@gmail.com> wrote:
>>>> I have a three node cluster using the EC2 Multi-Region Snitch currently
>>>> operating only in US-EAST. On having a node go down this morning, I
>>>> started a new node with an identical configuration, except for the seed
>>>> list, the listen address and the rpc address. The new node comes up and
>>>> creates its own cluster rather than joining the pre-existing ring. I've
>>>> tried creating a node both before and after using `nodetool remove` for the
>>>> bad node, each time with the same result.
>>>>
>>>> What version of Cassandra?
>>>>
>>>> This particular confusing behavior is fixed upstream, in a version you
>>>> should not deploy to production yet. Take some solace, however, that you
>>>> may be the last Cassandra administrator to die for a broken code path!
>>>>
>>>> https://issues.apache.org/jira/browse/CASSANDRA-5768
>>>>
>>>> Does anyone have any suggestions for where to look that might put me on
>>>> the right track?
>>>>
>>>> It must be that your seed list is wrong in some way, or your node state is
>>>> wrong.
>>>> If you're trying to bootstrap a node, note that you can't bootstrap
>>>> a node when it is in its own seed list.
>>>>
>>>> If you have installed Cassandra via the Debian package, there is a
>>>> possibility that your node started before you explicitly started it. If
>>>> so, it might have invalid node state.
>>>>
>>>> Have you tried wiping the data directory and trying again?
>>>>
>>>> What is your seed list? Are you sure the new node can reach the seeds on
>>>> the network layer?
>>>>
>>>> =Rob
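
Two of the checks Rob suggests (the new node must not appear in its own seed list, and it must be able to reach the seeds at the network layer) can be approximated with a short script. This is only a rough sketch, not something from the thread: the function names are made up for illustration, and it assumes the seeds listen on port 7000, which is Cassandra's default `storage_port`. If your cassandra.yaml sets a different port, or the nodes talk over the SSL storage port, pass that port instead.

```python
import socket

# Assumption: 7000 is Cassandra's default storage_port; check cassandra.yaml.
STORAGE_PORT = 7000


def parse_seeds(seeds):
    """Split the comma-separated seeds string from cassandra.yaml."""
    return [s.strip() for s in seeds.split(",") if s.strip()]


def node_is_own_seed(seeds, node_addresses):
    """True if any of this node's own addresses appears in its seed list."""
    return bool(set(parse_seeds(seeds)) & set(node_addresses))


def can_reach(host, port=STORAGE_PORT, timeout=3.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_seeds(seeds, node_addresses, port=STORAGE_PORT):
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    if node_is_own_seed(seeds, node_addresses):
        problems.append("node is listed in its own seed list")
    for seed in parse_seeds(seeds):
        # Skip the node's own addresses; only remote seeds are probed.
        if seed not in node_addresses and not can_reach(seed, port):
            problems.append(f"cannot open TCP connection to seed {seed}:{port}")
    return problems
```

Run on the new node, a call such as `check_seeds("10.0.0.1,10.0.0.2", ["10.0.0.3"])` (hypothetical addresses) returns one message per failed check. A successful TCP connection only shows that the port is open; it says nothing about whether the version handshake will succeed.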