Milind,

Thank you for attaching the patch here, but it would be really nice if you could create a JIRA account so you could participate in the discussion on the ticket and attach the patch there - that is how people license their contributions under the Apache 2 license. You just need to create an account on the public JIRA linked from the top of the ticket.

It's understandable that it wouldn't necessarily be a general solution now - but it's a start toward understanding what would need to be done so that, if possible, something general could be derived. I'm just trying to help get the discussion started so it could become something that people could do out of the box. Beyond that, it could then be tested and evolve with the codebase, so that people would know it is hardened and used by others. Any limitations would be nice to note when you attach the patch to the ticket.

Thanks so much for your work on this!

Jeremy

On Mar 21, 2011, at 11:29 PM, Milind Parikh wrote:

> Patch is attached... I don't have access to Jira.
>
> A cautionary note: This is NOT a general solution and is not intended as
> such. It could be included as a part of larger patch. I will explain in the
> limitation sections about why it is not a general solution; as I find time.
>
> Regards
> Milind
>
> On Mon, Mar 21, 2011 at 11:42 PM, Jeremy Hanna <jeremy.hanna1...@gmail.com>
> wrote:
> Sorry if I was presumptuous earlier. I created a ticket so that the patch
> could be submitted and reviewed - that is if it can be generalized so that it
> works across regions and doesn't adversely affect the common case.
> https://issues.apache.org/jira/browse/CASSANDRA-2362
>
> On Mar 21, 2011, at 10:41 PM, Jeremy Hanna wrote:
>
> > Sorry if I was presumptuous earlier. I created a ticket so that the patch
> > could be submitted and reviewed - that is if it can be generalized so that
> > it works across regions and doesn't adversely affect the common case.
> > https://issues.apache.org/jira/browse/CASSANDRA-2362
> >
> > On Mar 21, 2011, at 12:20 PM, Jeremy Hanna wrote:
> >
> >> I talked to Matt Dennis in the channel about it and I think everyone would
> >> like to make sure that cassandra works great across multiple regions. He
> >> sounded like he didn't know why it wouldn't work after having looked at
> >> the patches.
> >> I would like to try it both ways - with and without the
> >> patches later today if I can and I'd like to help out with getting it
> >> working out of the box.
> >>
> >> Thanks for the investigative work and documentation Milind!
> >>
> >> Jeremy
> >>
> >> On Mar 21, 2011, at 12:12 PM, Dave Viner wrote:
> >>
> >>> Hi Milind,
> >>>
> >>> Great work here. Can you provide the patch against the 2 files?
> >>>
> >>> Perhaps there's some way to incorporate it into the trunk of cassandra so
> >>> that this is feasible (in a future release) without patching the source
> >>> code.
> >>>
> >>> Dave Viner
> >>>
> >>> On Mon, Mar 21, 2011 at 9:41 AM, A J <s5a...@gmail.com> wrote:
> >>> Thanks for sharing the document, Milind!
> >>> Followed the instructions and it worked for me.
> >>>
> >>> On Mon, Mar 21, 2011 at 5:01 AM, Milind Parikh <milindpar...@gmail.com>
> >>> wrote:
> >>>> Here's the document on Cassandra (0.7.4) across EC2 regions. Clearly
> >>>> this is work in progress.... but wanted to share what I have. PDF is
> >>>> the working copy.
> >>>>
> >>>> https://docs.google.com/document/d/175duUNIx7m5mCDa2sjXVI04ekyMa5bdiWdu-AFgisaY/edit?hl=en
> >>>>
> >>>> On Sun, Mar 20, 2011 at 7:49 PM, aaron morton <aa...@thelastpickle.com>
> >>>> wrote:
> >>>>>
> >>>>> Recent discussion on the dev list
> >>>>> http://www.mail-archive.com/dev@cassandra.apache.org/msg01832.html
> >>>>> Aaron
> >>>>>
> >>>>> On 19 Mar 2011, at 06:46, A J wrote:
> >>>>>
> >>>>> Just to add, all the telnet (port 7000) and cassandra-cli (port 9160)
> >>>>> connections are done using the public DNS (that goes like
> >>>>> ec2-.....compute.amazonaws.com)
> >>>>>
> >>>>> On Fri, Mar 18, 2011 at 1:37 PM, A J <s5a...@gmail.com> wrote:
> >>>>>
> >>>>> I am able to telnet from one region to another on 7000 port without
> >>>>> issues. (I get the expected Connected to .....Escape character is
> >>>>> '^]'.)
> >>>>>
> >>>>> Also I am able to execute cassandra client on 9160 port from one
> >>>>> region to another without issues (this is when I run cassandra
> >>>>> separately on each region without forming a cluster).
> >>>>>
> >>>>> So I think the ports 7000 and 9160 are not the issue.
> >>>>>
> >>>>> On Fri, Mar 18, 2011 at 1:26 PM, Dave Viner <davevi...@gmail.com> wrote:
> >>>>>
> >>>>> From the us-west instance, are you able to connect to the us-east
> >>>>> instance using telnet on port 7000 and 9160?
> >>>>>
> >>>>> If not, then you need to open those ports for communication (via your
> >>>>> Security Group)
> >>>>>
> >>>>> Dave Viner
> >>>>>
> >>>>> On Fri, Mar 18, 2011 at 10:20 AM, A J <s5a...@gmail.com> wrote:
> >>>>>
> >>>>> That's exactly what I am doing.
> >>>>>
> >>>>> I was able to do the first two scenarios without any issues (i.e. 2
> >>>>> nodes in same availability zone. Followed by an additional node in a
> >>>>> different zone but same region)
> >>>>>
> >>>>> I am stuck at the third scenario of separate regions.
> >>>>>
> >>>>> (I did read the "Cassandra nodes on EC2 in two different regions not
> >>>>> communicating" thread but it did not seem to end with resolution)
> >>>>>
> >>>>> On Fri, Mar 18, 2011 at 1:15 PM, Dave Viner <davevi...@gmail.com> wrote:
> >>>>>
> >>>>> Hi AJ,
> >>>>>
> >>>>> I'd suggest getting to a multi-region cluster step-by-step. First, get
> >>>>> 2 nodes running in the same availability zone. Make sure that works
> >>>>> properly. Second, add a node in a separate availability zone, but in
> >>>>> the same region. Make sure that's working properly. Third, add a node
> >>>>> that's in a separate region.
> >>>>>
> >>>>> Taking it step-by-step will ensure that any issues are specific to the
> >>>>> region-to-region communication, rather than intra-zone connectivity or
> >>>>> cassandra cluster configuration.
> >>>>>
> >>>>> Dave Viner
> >>>>>
> >>>>> On Fri, Mar 18, 2011 at 8:34 AM, A J <s5a...@gmail.com> wrote:
> >>>>>
> >>>>> Hello,
> >>>>>
> >>>>> I am trying to set up a cassandra cluster across regions.
> >>>>>
> >>>>> For testing I am keeping it simple and just having one node in US-EAST
> >>>>> (say ec2-1-2-3-4.compute-1.amazonaws.com) and one node in US-WEST (say
> >>>>> ec2-2-2-3-4.us-west-1.compute.amazonaws.com).
> >>>>>
> >>>>> Using Cassandra 0.7.4
> >>>>>
> >>>>> The one in east region is the seed node and has the values as:
> >>>>>
> >>>>> auto_bootstrap: false
> >>>>> seeds: ec2-1-2-3-4.compute-1.amazonaws.com
> >>>>> listen_address: ec2-1-2-3-4.compute-1.amazonaws.com
> >>>>> rpc_address: 0.0.0.0
> >>>>>
> >>>>> The one in west region is non seed and has the values as:
> >>>>>
> >>>>> auto_bootstrap: true
> >>>>> seeds: ec2-1-2-3-4.compute-1.amazonaws.com
> >>>>> listen_address: ec2-2-2-3-4.us-west-1.compute.amazonaws.com
> >>>>> rpc_address: 0.0.0.0
> >>>>>
> >>>>> I first fire the seed node (east region instance) and it comes up
> >>>>> without issues.
> >>>>>
> >>>>> When I fire the non-seed node (west region instance) it fails after
> >>>>> some time with the error:
> >>>>>
> >>>>> DEBUG 15:09:08,844 Created HHOM instance, registered MBean.
> >>>>> INFO 15:09:08,844 Joining: getting load information
> >>>>> INFO 15:09:08,845 Sleeping 90000 ms to wait for load information...
> >>>>> DEBUG 15:09:09,822 attempting to connect to ec2-1-2-3-4.compute-1.amazonaws.com/1.2.3.4
> >>>>> DEBUG 15:09:10,825 Disseminating load info ...
> >>>>> DEBUG 15:10:10,826 Disseminating load info ...
> >>>>> DEBUG 15:10:38,845 ... got load info
> >>>>> INFO 15:10:38,845 Joining: getting bootstrap token
> >>>>> ERROR 15:10:38,847 Exception encountered during startup.
> >>>>> java.lang.RuntimeException: No other nodes seen! Unable to bootstrap
> >>>>> at org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:164)
> >>>>> at org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:146)
> >>>>> at org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:141)
> >>>>> at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:450)
> >>>>> at org.apache.cassandra.service.StorageService.initServer(StorageService.java:404)
> >>>>> at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:192)
> >>>>> at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:314)
> >>>>> at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
> >>>>>
> >>>>> The seed node seems to somewhat acknowledge the non-seed node:
> >>>>>
> >>>>> attempting to connect to /2.2.3.4
> >>>>> attempting to connect to /10.170.190.31
> >>>>>
> >>>>> Can you suggest how I can fix it (I did see a few threads on similar
> >>>>> issue but did not really follow the chain)
> >>>>>
> >>>>> Thanks, AJ
>
> <cassec2regions.patch>
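
For anyone repeating the telnet checks from earlier in the thread (port 7000 between nodes, port 9160 for cassandra-cli), here is a minimal Python sketch of the same reachability test. The function names `can_connect` and `check_ports` are made up for illustration and are not part of Cassandra or of the attached patch. Note that, as A J reported above, both ports being reachable did not prevent the "No other nodes seen!" bootstrap failure, so this only rules out the security-group case.

```python
import socket

# Ports mentioned in the thread: 7000 (node-to-node) and 9160 (cassandra-cli).
CASSANDRA_PORTS = (7000, 9160)


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and opens a TCP socket,
        # which is roughly what "telnet host port" does.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


def check_ports(host, ports=CASSANDRA_PORTS, timeout=5.0):
    """Return a dict mapping each port to whether it was reachable."""
    return {port: can_connect(host, port, timeout) for port in ports}
```

Run from the us-west node, `check_ports("ec2-1-2-3-4.compute-1.amazonaws.com")` (the placeholder seed host name used in the thread) would report whether each port on the us-east node accepts connections.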