Hello,

For an 8-node Cassandra 2.2.11 cluster spread over two datacenters, I am
looking to change from using only private IP addresses to a combination of
private and public IP addresses and interfaces. The cluster uses the
GossipingPropertyFileSnitch.

The aim is to end up with each peer being known by its public IP address,
but communicating over the private network where possible. We are looking
to make this change in a rolling fashion, that is, without taking down the
entire cluster.

To this end, I have made the following configuration changes:

   1. cassandra-rackdc.properties:
      - add the "prefer_local=true" option
   2. cassandra.yaml:
      - change broadcast_address from the private address to the public
        address
      - add "listen_on_broadcast_address: true"
      - leave listen_address set to the private IP address
      - change broadcast_rpc_address from the private address to the
        public address
   3. infrastructure:
      - amend firewall settings to allow TCP traffic between peers on all
        interfaces for the relevant ports
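
For reference, the settings listed above would look roughly like this on
the changed node. This is only a sketch of the options named in this
message, with addresses masked in the same way as in the nodetool output
further down (10.x.x.x for the private address, 123.x.x.x for the public
one); all other settings are left out.

```yaml
# cassandra.yaml on the changed node (sketch; addresses masked)
listen_address: 10.x.x.x            # private address, unchanged
broadcast_address: 123.x.x.x        # was the private address
listen_on_broadcast_address: true   # newly added
broadcast_rpc_address: 123.x.x.x    # was the private address

# cassandra-rackdc.properties is a separate file, which gains the line:
#   prefer_local=true
```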

When I apply the described changes to a non-seed node, the following
happens:
- The node appears to start up normally
- nodetool status on the changed node shows all peers as down, except the
local node:

Datacenter: DC-A
=================
DN  10.x.x.x
DN  10.x.x.x
DN  10.x.x.x
DN  10.x.x.x
DN  10.x.x.x

Datacenter: DC-B
=================
DN  10.x.x.x
DN  10.x.x.x
UN  123.x.x.x

- nodetool status on any other node shows all peers as up, except the
changed node, which is still known under its old IP address:

Datacenter: DC-A
=================
UN  10.x.x.x
UN  10.x.x.x
UN  10.x.x.x
UN  10.x.x.x
UN  10.x.x.x

Datacenter: DC-B
=================
UN  10.x.x.x
UN  10.x.x.x
DN  10.x.x.x

Reverting the changes in cassandra.yaml and restarting that node causes the
cluster to go back to normal. I have tried various combinations of
private/public IP address settings, all to no avail.

I have successfully set up a similar configuration in a test cluster, but I
had to bring down the entire cluster to get it to work.

My question is: is it possible to make such a change in a phased way
without bringing down the entire cluster? If yes, what is the best approach?

Many thanks in advance.

Thomas



-- 

Thomas Goossens
CTO

Drillster

Rijnzathe 16
3454 PV De Meern
Netherlands
+31 88 375 0500
www.drillster.com
