A few months ago, there was a thread on this list about using Cassandra across multiple EC2 regions. I was interested in doing the same thing, and managed to make it work.
To implement this, there are basically two things that need to change. First, in storage-conf.xml, I used the "external" IP addresses for <ListenAddress> and <Seed>. These external addresses are needed for machines in different regions to talk to each other, and they also work within a region.

Second, Cassandra itself has to be patched. Stock Cassandra tries to bind and listen on the <ListenAddress> address and gives up, because on the instance the external address doesn't appear to be a valid local network address. This patch causes Cassandra to listen on the local network address, rather than the <ListenAddress> defined in the config file. (This is not a completely general solution. It assumes that there is only one local network, and that the default network is the one to use, but, at least for EC2, that assumption should be OK.)

Part of my motivation for posting here is to solicit feedback on the third part of the patch (FileStreamTask.java). I was able to get my two-region cluster up and running by patching just the first two files. The third change may be needed under certain conditions, but I never seemed to hit that code.
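To illustrate the bind problem outside of Cassandra, here is a minimal standalone sketch. It is not part of the patch: the class and method names (BindFallbackSketch, canBind, chooseBindAddress) are made up for this example, and 203.0.113.10 is a documentation-range address standing in for an EC2 external IP. Unlike the patch, which always binds to InetAddress.getLocalHost(), this sketch only falls back to the local address when the advertised address cannot be bound.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.UnknownHostException;

// Hypothetical helper, not Cassandra code: shows why binding to an
// address that is not on a local interface fails, and one way to fall back.
class BindFallbackSketch {

    /** Returns true if a server socket can be bound to addr on an ephemeral port. */
    static boolean canBind(InetAddress addr) {
        try (ServerSocket ss = new ServerSocket()) {
            ss.setReuseAddress(true);
            // Port 0 asks the OS for any free port.
            ss.bind(new InetSocketAddress(addr, 0));
            return true;
        } catch (IOException e) {
            // Typically a BindException ("Cannot assign requested address").
            return false;
        }
    }

    /** Uses the advertised address if it is bindable, otherwise the local host address. */
    static InetAddress chooseBindAddress(InetAddress advertised) throws UnknownHostException {
        return canBind(advertised) ? advertised : InetAddress.getLocalHost();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: pass the instance's external IP as the first argument;
        // the default is a placeholder from the TEST-NET-3 documentation range.
        String advertisedIp = args.length > 0 ? args[0] : "203.0.113.10";
        InetAddress advertised = InetAddress.getByName(advertisedIp);
        InetAddress bindAddr = chooseBindAddress(advertised);
        System.out.println("advertised=" + advertised.getHostAddress()
                + " bind=" + bindAddr.getHostAddress());
    }
}
```

Run on an EC2 instance with its external IP as the argument, this should report a different bind address than the advertised one. Like the patch, it relies on InetAddress.getLocalHost() resolving to the one usable local network.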
Here's the source patch:

diff -ur orig/apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/MessagingService.java apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/MessagingService.java
--- orig/apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/MessagingService.java	2010-08-16 17:48:02.000000000 -0500
+++ apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/MessagingService.java	2010-09-01 10:05:34.000000000 -0500
@@ -147,7 +147,16 @@
         ServerSocketChannel serverChannel = ServerSocketChannel.open();
         final ServerSocket ss = serverChannel.socket();
         ss.setReuseAddress(true);
+
+/* OLD
         ss.bind(new InetSocketAddress(localEp, DatabaseDescriptor.getStoragePort()));
+*/
+        /* In order to allow using Amazon EC2 across regions, we listen
+         * on our local address, rather rather than the "public" IP address
+         * defined in storage-conf.xml
+         */
+        ss.bind(new InetSocketAddress(InetAddress.getLocalHost(), DatabaseDescriptor.getStoragePort()));
+
         socketThread = new SocketThread(ss, "ACCEPT-" + localEp);
         socketThread.start();
         listenGate.signalAll();
diff -ur orig/apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/OutboundTcpConnection.java apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/OutboundTcpConnection.java
--- orig/apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/OutboundTcpConnection.java	2010-07-27 16:09:18.000000000 -0500
+++ apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/OutboundTcpConnection.java	2010-09-01 10:09:31.000000000 -0500
@@ -149,7 +149,16 @@
             try
             {
                 // zero means 'bind on any available port.'
+
+                /* In order to allow using Amazon EC2 across regions, we
+                 * listen on our local address, rather rather than the
+                 * "public" IP address defined in storage-conf.xml
+                 */
+
+/* OLD
                 socket = new Socket(endpoint, DatabaseDescriptor.getStoragePort(), FBUtilities.getLocalAddress(), 0);
+*/
+                socket = new Socket(endpoint, DatabaseDescriptor.getStoragePort(), InetAddress.getLocalHost(), 0);
                 socket.setTcpNoDelay(true);
                 output = new DataOutputStream(socket.getOutputStream());
                 return true;
diff -ur orig/apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/FileStreamTask.java apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/FileStreamTask.java
--- orig/apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/FileStreamTask.java	2010-05-28 11:23:04.000000000 -0500
+++ apache-cassandra-0.6.5-src/src/java/org/apache/cassandra/net/FileStreamTask.java	2010-09-01 10:07:43.000000000 -0500
@@ -122,6 +122,14 @@
         {
             SocketChannel channel = SocketChannel.open();
             // force local binding on correctly specified interface.
+
+            /* When using Amazon EC2 "public" IP addresses, we probably
+             * won't be able to bind to the address. However, I don't see
+             * this code getting hit, and I'm not sure under what circumstances
+             * it would get run.
+             */
+System.out.println("FIXME - probably can't bind to this address: "+FBUtilities.getLocalAddress()+"\n");
+
             channel.socket().bind(new InetSocketAddress(FBUtilities.getLocalAddress(), 0));
             int attempts = 0;
             while (true)

-- 
Peter Fales
Alcatel-Lucent
Member of Technical Staff
1960 Lucent Lane
Room: 9H-505
Naperville, IL 60566-7033
Email: peter.fa...@alcatel-lucent.com
Phone: 630 979 8031