There are some other knock-on issues too.  The SSL work that has been
done would also have to be changed ...

-sd

On Tue, Mar 22, 2011 at 6:58 PM, A J <s5a...@gmail.com> wrote:
> Milind,
> Among the limitations, you might want to add that 'nodetool repair' does
> not work with this patch.
> I tried several times and the repair hangs.
> When I run it directly on the trunk of 0.7.4 (without the patch) it
> completes successfully within reasonable time.
>
> Thanks.
>
> On Tue, Mar 22, 2011 at 1:07 PM, Jeremy Hanna
> <jeremy.hanna1...@gmail.com> wrote:
>> Never mind - I had thought it was more generalizable but since it's just 
>> going against the public IP between regions, that's not going to be 
>> something that makes it into trunk.  I had just wanted to see if there was a 
>> way that it could be done, but it sounds like since amazon doesn't provide 
>> decent information between regions, something like this workaround patch is 
>> required.
>>
>> Anyway - thanks for the work on this.
>>
>> On Mar 22, 2011, at 8:33 AM, Jeremy Hanna wrote:
>>
>>> Milind,
>>>
>>> Thank you for attaching the patch here, but it would be really nice if you 
>>> could create a jira account so you could participate in the discussion on 
>>> the ticket and put the patch on there - that is the way people license 
>>> their contributions with the apache 2 license.  You just need to create an 
>>> account with the public jira linked off of the ticket at the top.
>>>
>>> Understandable that it wouldn't necessarily be a general solution now - but 
>>> it's a start to understanding what would need to be done so that if 
>>> possible, something general could be derived.  I'm just trying to help get 
>>> the discussion started so it could be something that people could do out of 
>>> the box.  Not only that, but also so that it could be tested and evolve 
>>> with the codebase so that people could know that it is hardened and used by 
>>> others.
>>>
>>> Any limitations would be nice to note when you attach the patch to the 
>>> ticket.
>>>
>>> Thanks so much for your work on this!
>>>
>>> Jeremy
>>>
>>> On Mar 21, 2011, at 11:29 PM, Milind Parikh wrote:
>>>
>>>> Patch is attached... I don't have access to Jira.
>>>>
>>>> A cautionary note: This is NOT a general solution and is not intended as 
>>>> such. It could be included as a part of larger patch. I will explain in 
>>>> the limitation sections about why it is not a general solution; as I find 
>>>> time.
>>>>
>>>> Regards
>>>> Milind
>>>>
>>>> On Mon, Mar 21, 2011 at 11:42 PM, Jeremy Hanna 
>>>> <jeremy.hanna1...@gmail.com> wrote:
>>>> Sorry if I was presumptuous earlier.  I created a ticket so that the patch 
>>>> could be submitted and reviewed - that is if it can be generalized so that 
>>>> it works across regions and doesn't adversely affect the common case.
>>>> https://issues.apache.org/jira/browse/CASSANDRA-2362
>>>>
>>>> On Mar 21, 2011, at 10:41 PM, Jeremy Hanna wrote:
>>>>
>>>>> Sorry if I was presumptuous earlier.  I created a ticket so that the 
>>>>> patch could be submitted and reviewed - that is if it can be generalized 
>>>>> so that it works across regions and doesn't adversely affect the common 
>>>>> case.
>>>>> https://issues.apache.org/jira/browse/CASSANDRA-2362
>>>>>
>>>>> On Mar 21, 2011, at 12:20 PM, Jeremy Hanna wrote:
>>>>>
>>>>>> I talked to Matt Dennis in the channel about it and I think everyone 
>>>>>> would like to make sure that cassandra works great across multiple 
>>>>>> regions.  He sounded like he didn't know why it wouldn't work after 
>>>>>> having looked at the patches.  I would like to try it both ways - with 
>>>>>> and without the patches later today if I can and I'd like to help out 
>>>>>> with getting it working out of the box.
>>>>>>
>>>>>> Thanks for the investigative work and documentation Milind!
>>>>>>
>>>>>> Jeremy
>>>>>>
>>>>>> On Mar 21, 2011, at 12:12 PM, Dave Viner wrote:
>>>>>>
>>>>>>> Hi Milind,
>>>>>>>
>>>>>>> Great work here.  Can you provide the patch against the 2 files?
>>>>>>>
>>>>>>> Perhaps there's some way to incorporate it into the trunk of cassandra 
>>>>>>> so that this is feasible (in a future release) without patching the 
>>>>>>> source code.
>>>>>>>
>>>>>>> Dave Viner
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Mar 21, 2011 at 9:41 AM, A J <s5a...@gmail.com> wrote:
>>>>>>> Thanks for sharing the document, Milind !
>>>>>>> Followed the instructions and it worked for me.
>>>>>>>
>>>>>>> On Mon, Mar 21, 2011 at 5:01 AM, Milind Parikh <milindpar...@gmail.com> 
>>>>>>> wrote:
>>>>>>>> Here's the document on Cassandra (0.7.4) across EC2 regions. Clearly this
>>>>>>>> is work in progress.... but wanted to share what I have. PDF is the
>>>>>>>> working copy.
>>>>>>>>
>>>>>>>>
>>>>>>>> https://docs.google.com/document/d/175duUNIx7m5mCDa2sjXVI04ekyMa5bdiWdu-AFgisaY/edit?hl=en
>>>>>>>>
>>>>>>>> On Sun, Mar 20, 2011 at 7:49 PM, aaron morton <aa...@thelastpickle.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Recent discussion on the dev list
>>>>>>>>> http://www.mail-archive.com/dev@cassandra.apache.org/msg01832.html
>>>>>>>>> Aaron
>>>>>>>>> On 19 Mar 2011, at 06:46, A J wrote:
>>>>>>>>>
>>>>>>>>> Just to add, all the telnet (port 7000) and cassandra-cli (port 9160)
>>>>>>>>> connections are done using the public DNS (that goes like
>>>>>>>>> ec2-.....compute.amazonaws.com)
>>>>>>>>>
>>>>>>>>> On Fri, Mar 18, 2011 at 1:37 PM, A J <s5a...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> I am able to telnet from one region to another on 7000 port without
>>>>>>>>> issues. (I get the expected Connected to .....Escape character is
>>>>>>>>> '^]'.)
>>>>>>>>>
>>>>>>>>> Also I am able to execute cassandra client on 9160 port from one
>>>>>>>>> region to another without issues (this is when I run cassandra
>>>>>>>>> separately on each region without forming a cluster).
>>>>>>>>>
>>>>>>>>> So I think the ports 7000 and 9160 are not the issue.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Mar 18, 2011 at 1:26 PM, Dave Viner <davevi...@gmail.com> 
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> From the us-west instance, are you able to connect to the us-east
>>>>>>>>> instance using telnet on port 7000 and 9160?
>>>>>>>>>
>>>>>>>>> If not, then you need to open those ports for communication (via your
>>>>>>>>> Security Group)
>>>>>>>>>
>>>>>>>>> Dave Viner
>>>>>>>>>
>>>>>>>>> On Fri, Mar 18, 2011 at 10:20 AM, A J <s5a...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> That's exactly what I am doing.
>>>>>>>>>
>>>>>>>>> I was able to do the first two scenarios without any issues (i.e. 2
>>>>>>>>> nodes in same availability zone. Followed by an additional node in a
>>>>>>>>> different zone but same region)
>>>>>>>>>
>>>>>>>>> I am stuck at the third scenario of separate regions.
>>>>>>>>>
>>>>>>>>> (I did read the "Cassandra nodes on EC2 in two different regions not
>>>>>>>>> communicating" thread but it did not seem to end with resolution)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Mar 18, 2011 at 1:15 PM, Dave Viner <davevi...@gmail.com> 
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi AJ,
>>>>>>>>>
>>>>>>>>> I'd suggest getting to a multi-region cluster step-by-step.  First, get 2
>>>>>>>>> nodes running in the same availability zone.  Make sure that works
>>>>>>>>> properly.  Second, add a node in a separate availability zone, but in the
>>>>>>>>> same region.  Make sure that's working properly.  Third, add a node that's
>>>>>>>>> in a separate region.
>>>>>>>>>
>>>>>>>>> Taking it step-by-step will ensure that any issues are specific to the
>>>>>>>>> region-to-region communication, rather than intra-zone connectivity or
>>>>>>>>> cassandra cluster configuration.
>>>>>>>>>
>>>>>>>>> Dave Viner
>>>>>>>>>
>>>>>>>>> On Fri, Mar 18, 2011 at 8:34 AM, A J <s5a...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> I am trying to set up a cassandra cluster across regions.
>>>>>>>>>
>>>>>>>>> For testing I am keeping it simple and just having one node in US-EAST
>>>>>>>>> (say ec2-1-2-3-4.compute-1.amazonaws.com) and one node in US-WEST (say
>>>>>>>>> ec2-2-2-3-4.us-west-1.compute.amazonaws.com).
>>>>>>>>>
>>>>>>>>> Using Cassandra 0.7.4
>>>>>>>>>
>>>>>>>>> The one in east region is the seed node and has the values as:
>>>>>>>>> auto_bootstrap: false
>>>>>>>>> seeds: ec2-1-2-3-4.compute-1.amazonaws.com
>>>>>>>>> listen_address: ec2-1-2-3-4.compute-1.amazonaws.com
>>>>>>>>> rpc_address: 0.0.0.0
>>>>>>>>>
>>>>>>>>> The one in west region is non seed and has the values as:
>>>>>>>>> auto_bootstrap: true
>>>>>>>>> seeds: ec2-1-2-3-4.compute-1.amazonaws.com
>>>>>>>>> listen_address: ec2-2-2-3-4.us-west-1.compute.amazonaws.com
>>>>>>>>> rpc_address: 0.0.0.0
>>>>>>>>>
>>>>>>>>> I first fire the seed node (east region instance) and it comes up
>>>>>>>>> without issues.
>>>>>>>>>
>>>>>>>>> When I fire the non-seed node (west region instance) it fails after
>>>>>>>>> some time with the error:
>>>>>>>>>
>>>>>>>>> DEBUG 15:09:08,844 Created HHOM instance, registered MBean.
>>>>>>>>> INFO 15:09:08,844 Joining: getting load information
>>>>>>>>> INFO 15:09:08,845 Sleeping 90000 ms to wait for load information...
>>>>>>>>> DEBUG 15:09:09,822 attempting to connect to ec2-1-2-3-4.compute-1.amazonaws.com/1.2.3.4
>>>>>>>>> DEBUG 15:09:10,825 Disseminating load info ...
>>>>>>>>> DEBUG 15:10:10,826 Disseminating load info ...
>>>>>>>>> DEBUG 15:10:38,845 ... got load info
>>>>>>>>> INFO 15:10:38,845 Joining: getting bootstrap token
>>>>>>>>> ERROR 15:10:38,847 Exception encountered during startup.
>>>>>>>>> java.lang.RuntimeException: No other nodes seen!  Unable to bootstrap
>>>>>>>>>     at org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:164)
>>>>>>>>>     at org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:146)
>>>>>>>>>     at org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:141)
>>>>>>>>>     at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:450)
>>>>>>>>>     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:404)
>>>>>>>>>     at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:192)
>>>>>>>>>     at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:314)
>>>>>>>>>     at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The seed node seems to somewhat acknowledge the non-seed node:
>>>>>>>>> attempting to connect to /2.2.3.4
>>>>>>>>> attempting to connect to /10.170.190.31
>>>>>>>>>
>>>>>>>>> Can you suggest how I can fix it? (I did see a few threads on similar
>>>>>>>>> issues but did not really follow the chain)
>>>>>>>>>
>>>>>>>>> Thanks, AJ
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> <cassec2regions.patch>
>>>
>>
>>
>
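
For anyone repeating the connectivity check from earlier in the thread (telnet to ports 7000 and 9160 between regions), here is a minimal Python sketch of the same test. The function names `can_connect` and `check_ports` are made up for illustration and are not part of Cassandra. Note that, as the thread shows, both ports being reachable does not by itself mean the nodes will form a cluster.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ports(host, ports=(7000, 9160), timeout=5.0):
    """Map each port to whether a TCP connection to it could be opened."""
    return {port: can_connect(host, port, timeout) for port in ports}
```

It would be run from the us-west instance against the seed's public DNS name, e.g. `check_ports("ec2-1-2-3-4.compute-1.amazonaws.com")` using the placeholder host name from the original post.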



-- 
Sasha Dolgy
sasha.do...@gmail.com