There are some other knock-on issues too. The SSL work that has been done would also have to be changed ...
-sd

On Tue, Mar 22, 2011 at 6:58 PM, A J <s5a...@gmail.com> wrote:
> Milind,
> Among the limitation you might want to add that 'nodetool repair' does
> not work with this patch.
> I tried several times and the repair hangs.
> When I run it directly on the trunk of 0.7.4 (without the patch) it
> completes successfully within reasonable time.
>
> Thanks.
>
> On Tue, Mar 22, 2011 at 1:07 PM, Jeremy Hanna
> <jeremy.hanna1...@gmail.com> wrote:
>> Never mind - I had thought it was more generalizable but since it's just
>> going against the public IP between regions, that's not going to be
>> something that makes it into trunk. I had just wanted to see if there was a
>> way that it could be done, but it sounds like since amazon doesn't provide
>> decent information between regions, something like this workaround patch is
>> required.
>>
>> Anyway - thanks for the work on this.
>>
>> On Mar 22, 2011, at 8:33 AM, Jeremy Hanna wrote:
>>
>>> Milind,
>>>
>>> Thank you for attaching the patch here, but it would be really nice if you
>>> could create a jira account so you could participate in the discussion on
>>> the ticket and put the patch on there - that is the way people license
>>> their contributions with the apache 2 license. You just need to create an
>>> account with the public jira inked off of the ticket at the top.
>>>
>>> Understandable that it would necessarily be a general solution now - but
>>> it's a start to understanding what would need to be done so that if
>>> possible, something general could be derived. I'm just trying to help get
>>> the discussion started so it could be something that people could do out of
>>> the box. Not only that, but also so that it could be tested and evolve
>>> with the codebase so that people could know that it is hardened and used by
>>> others.
>>>
>>> Any limitations would be nice to note when you attach the patch to the
>>> ticket.
>>>
>>> Thanks so much for your work on this!
>>>
>>> Jeremy
>>>
>>> On Mar 21, 2011, at 11:29 PM, Milind Parikh wrote:
>>>
>>>> Patch is attached... I don't have access to Jira.
>>>>
>>>> A cautionery note: This is NOT a general solution and is not intended as
>>>> such. It could be included as a part of larger patch. I will explain in
>>>> the limitation sections about why it is not a general solution; as I find
>>>> time.
>>>>
>>>> Regards
>>>> Milind
>>>>
>>>> On Mon, Mar 21, 2011 at 11:42 PM, Jeremy Hanna
>>>> <jeremy.hanna1...@gmail.com> wrote:
>>>> Sorry if I was presumptuous earlier. I created a ticket so that the patch
>>>> could be submitted and reviewed - that is if it can be generalized so that
>>>> it works across regions and doesn't adversely affect the common case.
>>>> https://issues.apache.org/jira/browse/CASSANDRA-2362
>>>>
>>>> On Mar 21, 2011, at 10:41 PM, Jeremy Hanna wrote:
>>>>
>>>>> Sorry if I was presumptuous earlier. I created a ticket so that the
>>>>> patch could be submitted and reviewed - that is if it can be generalized
>>>>> so that it works across regions and doesn't adversely affect the common
>>>>> case.
>>>>> https://issues.apache.org/jira/browse/CASSANDRA-2362
>>>>>
>>>>> On Mar 21, 2011, at 12:20 PM, Jeremy Hanna wrote:
>>>>>
>>>>>> I talked to Matt Dennis in the channel about it and I think everyone
>>>>>> would like to make sure that cassandra works great across multiple
>>>>>> regions. He sounded like he didn't know why it wouldn't work after
>>>>>> having looked at the patches. I would like to try it both ways - with
>>>>>> and without the patches later today if I can and I'd like to help out
>>>>>> with getting it working out of the box.
>>>>>>
>>>>>> Thanks for the investigative work and documentation Milind!
>>>>>>
>>>>>> Jeremy
>>>>>>
>>>>>> On Mar 21, 2011, at 12:12 PM, Dave Viner wrote:
>>>>>>
>>>>>>> Hi Milind,
>>>>>>>
>>>>>>> Great work here. Can you provide the patch against the 2 files?
>>>>>>>
>>>>>>> Perhaps there's some way to incorporate it into the trunk of cassandra
>>>>>>> so that this is feasible (in a future release) without patching the
>>>>>>> source code.
>>>>>>>
>>>>>>> Dave Viner
>>>>>>>
>>>>>>> On Mon, Mar 21, 2011 at 9:41 AM, A J <s5a...@gmail.com> wrote:
>>>>>>> Thanks for sharing the document, Milind !
>>>>>>> Followed the instructions and it worked for me.
>>>>>>>
>>>>>>> On Mon, Mar 21, 2011 at 5:01 AM, Milind Parikh <milindpar...@gmail.com> wrote:
>>>>>>>> Here's the document on Cassandra (0.7.4) across EC2 regions. Clearly this is
>>>>>>>> work in progress.... but wanted to share what I have. PDF is the working
>>>>>>>> copy.
>>>>>>>>
>>>>>>>> https://docs.google.com/document/d/175duUNIx7m5mCDa2sjXVI04ekyMa5bdiWdu-AFgisaY/edit?hl=en
>>>>>>>>
>>>>>>>> On Sun, Mar 20, 2011 at 7:49 PM, aaron morton <aa...@thelastpickle.com> wrote:
>>>>>>>>>
>>>>>>>>> Recent discussion on the dev list
>>>>>>>>> http://www.mail-archive.com/dev@cassandra.apache.org/msg01832.html
>>>>>>>>> Aaron
>>>>>>>>> On 19 Mar 2011, at 06:46, A J wrote:
>>>>>>>>>
>>>>>>>>> Just to add, all the telnet (port 7000) and cassandra-cli (port 9160)
>>>>>>>>> connections are done using the public DNS (that goes like
>>>>>>>>> ec2-.....compute.amazonaws.com)
>>>>>>>>>
>>>>>>>>> On Fri, Mar 18, 2011 at 1:37 PM, A J <s5a...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> I am able to telnet from one region to another on 7000 port without
>>>>>>>>> issues. (I get the expected Connected to .....Escape character is
>>>>>>>>> '^]'.)
>>>>>>>>>
>>>>>>>>> Also I am able to execute cassandra client on 9160 port from one
>>>>>>>>> region to another without issues (this is when I run cassandra
>>>>>>>>> separately on each region without forming a cluster).
>>>>>>>>>
>>>>>>>>> So I think the ports 7000 and 9160 are not the issue.
>>>>>>>>>
>>>>>>>>> On Fri, Mar 18, 2011 at 1:26 PM, Dave Viner <davevi...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> From the us-west instance, are you able to connect to the us-east instance
>>>>>>>>> using telnet on port 7000 and 9160?
>>>>>>>>>
>>>>>>>>> If not, then you need to open those ports for communication (via your
>>>>>>>>> Security Group)
>>>>>>>>>
>>>>>>>>> Dave Viner
>>>>>>>>>
>>>>>>>>> On Fri, Mar 18, 2011 at 10:20 AM, A J <s5a...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Thats exactly what I am doing.
>>>>>>>>>
>>>>>>>>> I was able to do the first two scenarios without any issues (i.e. 2
>>>>>>>>> nodes in same availability zone. Followed by an additional node in a
>>>>>>>>> different zone but same region)
>>>>>>>>>
>>>>>>>>> I am stuck at the third scenario of separate regions.
>>>>>>>>>
>>>>>>>>> (I did read the "Cassandra nodes on EC2 in two different regions not
>>>>>>>>> communicating" thread but it did not seem to end with resolution)
>>>>>>>>>
>>>>>>>>> On Fri, Mar 18, 2011 at 1:15 PM, Dave Viner <davevi...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi AJ,
>>>>>>>>>
>>>>>>>>> I'd suggest getting to a multi-region cluster step-by-step. First, get 2
>>>>>>>>> nodes running in the same availability zone. Make sure that works
>>>>>>>>> properly. Second, add a node in a separate availability zone, but in the
>>>>>>>>> same region. Make sure that's working properly. Third, add a node that's
>>>>>>>>> in a separate region.
>>>>>>>>>
>>>>>>>>> Taking it step-by-step will ensure that any issues are specific to the
>>>>>>>>> region-to-region communication, rather than intra-zone connectivity or
>>>>>>>>> cassandra cluster configuration.
>>>>>>>>>
>>>>>>>>> Dave Viner
>>>>>>>>>
>>>>>>>>> On Fri, Mar 18, 2011 at 8:34 AM, A J <s5a...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> I am trying to setup a cassandra cluster across regions.
>>>>>>>>>
>>>>>>>>> For testing I am keeping it simple and just having one node in US-EAST
>>>>>>>>> (say ec2-1-2-3-4.compute-1.amazonaws.com) and one node in US-WEST (say
>>>>>>>>> ec2-2-2-3-4.us-west-1.compute.amazonaws.com).
>>>>>>>>>
>>>>>>>>> Using Cassandra 0.7.4
>>>>>>>>>
>>>>>>>>> The one in east region is the seed node and has the values as:
>>>>>>>>>
>>>>>>>>> auto_bootstrap: false
>>>>>>>>> seeds: ec2-1-2-3-4.compute-1.amazonaws.com
>>>>>>>>> listen_address: ec2-1-2-3-4.compute-1.amazonaws.com
>>>>>>>>> rpc_address: 0.0.0.0
>>>>>>>>>
>>>>>>>>> The one in west region is non seed and has the values as:
>>>>>>>>>
>>>>>>>>> auto_bootstrap: true
>>>>>>>>> seeds: ec2-1-2-3-4.compute-1.amazonaws.com
>>>>>>>>> listen_address: ec2-2-2-3-4.us-west-1.compute.amazonaws.com
>>>>>>>>> rpc_address: 0.0.0.0
>>>>>>>>>
>>>>>>>>> I first fire the seed node (east region instance) and it comes up
>>>>>>>>> without issues.
>>>>>>>>>
>>>>>>>>> When I fire the non-seed node (west region instance) it fails after
>>>>>>>>> sometime with the error:
>>>>>>>>>
>>>>>>>>> DEBUG 15:09:08,844 Created HHOM instance, registered MBean.
>>>>>>>>> INFO 15:09:08,844 Joining: getting load information
>>>>>>>>> INFO 15:09:08,845 Sleeping 90000 ms to wait for load information...
>>>>>>>>> DEBUG 15:09:09,822 attempting to connect to ec2-1-2-3-4.compute-1.amazonaws.com/1.2.3.4
>>>>>>>>> DEBUG 15:09:10,825 Disseminating load info ...
>>>>>>>>> DEBUG 15:10:10,826 Disseminating load info ...
>>>>>>>>> DEBUG 15:10:38,845 ...
got load info
>>>>>>>>> INFO 15:10:38,845 Joining: getting bootstrap token
>>>>>>>>> ERROR 15:10:38,847 Exception encountered during startup.
>>>>>>>>> java.lang.RuntimeException: No other nodes seen! Unable to bootstrap
>>>>>>>>> at org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:164)
>>>>>>>>> at org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:146)
>>>>>>>>> at org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:141)
>>>>>>>>> at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:450)
>>>>>>>>> at org.apache.cassandra.service.StorageService.initServer(StorageService.java:404)
>>>>>>>>> at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:192)
>>>>>>>>> at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:314)
>>>>>>>>> at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
>>>>>>>>>
>>>>>>>>> The seed node seems to somewhat acknowledge the non-seed node:
>>>>>>>>>
>>>>>>>>> attempting to connect to /2.2.3.4
>>>>>>>>> attempting to connect to /10.170.190.31
>>>>>>>>>
>>>>>>>>> Can you suggest how can I fix it (I did see a few threads on similar
>>>>>>>>> issue but did not really follow the chain)
>>>>>>>>>
>>>>>>>>> Thanks, AJ
>>>>
>>>> <cassec2regions.patch>

--
Sasha Dolgy
sasha.do...@gmail.com
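
For anyone repeating the telnet check on ports 7000 and 9160 discussed further down the thread, here is a rough sketch of the same test in Python. The function names port_open and check_node are made up for illustration, and the hostname in the usage section is just the placeholder from A J's original mail. Note that a pass here only shows the ports are reachable; as A J found, the nodes can still fail to form a cluster even when both ports connect fine.

```python
import socket


def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def check_node(host, ports=(7000, 9160), timeout=5.0):
    """Map each port to whether it accepted a TCP connection.

    7000 and 9160 are the two ports tested with telnet and
    cassandra-cli in the thread.
    """
    return {port: port_open(host, port, timeout) for port in ports}


if __name__ == "__main__":
    # Placeholder public DNS name taken from the thread; replace with a real node.
    seed = "ec2-1-2-3-4.compute-1.amazonaws.com"
    for port, ok in check_node(seed).items():
        print(f"{seed}:{port} {'reachable' if ok else 'NOT reachable'}")
```

Run it from each region against the other region's public DNS name. If any port comes back unreachable, the Security Group is the first thing to look at, as Dave suggested.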