> So I guess we have to switch to Ec2MultiRegionSnitch.

It depends on how you are connecting the regions. If the nodes can directly communicate with each other, say through a VPN, you may not need to change it. If they are behind a NAT you will need to use it.
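As a rough sketch only, the relevant yaml settings on each existing node would look something like this (the IPs are placeholders, use each node's own addresses):

    # cassandra.yaml - sketch, placeholder IPs
    endpoint_snitch: Ec2MultiRegionSnitch
    listen_address: 10.0.0.1          # keep the node's private IP
    broadcast_address: 54.216.0.1     # the node's public IP
    seed_provider:
        - class_name: org.apache.cassandra.locator.SimpleSeedProvider
          parameters:
              - seeds: "54.216.0.1,54.216.0.2,50.16.0.1"   # public IPs

Remember that with this snitch the nodes gossip to the public IPs across regions, so open the storage port (7000 by default) between the regions in your security groups.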
When you change the snitch, test first and make sure that nodes do not change their DC or rack. There are potential problems changing from the PropertyFileSnitch, but they will not affect you.

> Our C* cluster : C*1.2.2, 6 EC2 m1.xLarge in eu-west already running, wanting
> to add 3 m1.xLarge on us-east

I recommend using the same number of nodes in both DCs.

> 1/ Change the yaml conf on each of the 6 eu-west existing nodes
> - Ec2Snitch to Ec2MultiRegionSnitch
> - uncomment the broadcast_address and set the public ip of the node
> - let the private ip as defined right now the listen_address
> - switch seeds from private to public IP

Sounds about right; remember to test and make limited changes. You may also want to enable SSL (see the yaml) and/or use a VPN or VPC between the DCs.
http://www.datastax.com/docs/1.0/cluster_architecture/replication

> 4/
> - Add 3 nodes one by one with auto_bootstrap set to true.
> 5/
> - Repair nodes (one by one)
> - Cleanup nodes (one by one)

Make sure the code is using LOCAL_QUORUM.

Add the nodes (I recommend 6) with auto_bootstrap: false added to the yaml. Update the keyspace replication strategy to add RF 3 for the new DC. Use nodetool rebuild on the new nodes to rebuild them from the eu-west DC. (There is a sketch of the order of operations below my sig.)

You do not need to use cleanup; data is not moving in the original DC. The two DCs each have copies.

> a/ Do I have to move the tokens since I don't use vnodes yet ? How should I
> position all these nodes ?

I prefer to use the offset method: take the 6 tokens from your eu-west DC and add 100 to them for the new DC. (Worked example below my sig.)

> b/ Is it useful to add a seed from the new us-east data center in the yaml of
> all nodes ?

Yes. Have 3 from each.

> c/ I am using the SimpleStrategy. Is it worth it/mandatory to change this
> strategy when using multiple DC ?

Yes. You *MUST* change this, otherwise your code will have to wait on cross-DC latency and you will not be able to use the LOCAL_ or EACH_ CL levels. You need to do this first. (There is a cqlsh sketch below my sig.)

There is some information out there on doing this. A change like this can result in data going missing, so do some testing. If all your nodes in eu-west are in the same AZ (the same Cassandra rack) then you can make the change to NTS without an impact. If not, it's going to be tricky.

> d/ With my 2 DC will I have 3 RF per DC or cross DC ?

Use NTS and have RF 3 in each DC.
http://www.datastax.com/docs/1.1/cluster_architecture/replication#replication-strategy

> e/ Should I configure my C* client to use the C* nodes from their region as
> coordinators (which seems to me the good way) or should I configure all the
> servers everywhere ?

Use the local nodes only.

The first thing is to update the replication strategy and get the code using LOCAL_QUORUM.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com
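A cqlsh sketch of the strategy change; the keyspace name is a placeholder and the DC names assume the EC2 snitch's region naming, so check yours with nodetool ring first. Step 1 happens before anything else, step 2 when you are ready to add the new DC:

    -- 1. switch to NTS, keeping the data placed where it is today
    ALTER KEYSPACE my_keyspace WITH replication =
        {'class': 'NetworkTopologyStrategy', 'eu-west': 3};

    -- 2. later, add the new DC before running rebuild
    ALTER KEYSPACE my_keyspace WITH replication =
        {'class': 'NetworkTopologyStrategy', 'eu-west': 3, 'us-east': 3};

    -- 3. the clients then read and write at LOCAL_QUORUM;
    --    to try it out in a cqlsh session:
    CONSISTENCY LOCAL_QUORUM;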
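A worked example of the token offset method, assuming RandomPartitioner with 6 evenly spaced tokens (if you are on Murmur3Partitioner the same + 100 trick applies to those tokens instead):

    eu-west (existing)                          us-east (existing + 100)
    0                                           100
    28356863910078205288614550619314017621      28356863910078205288614550619314017721
    56713727820156410577229101238628035242      56713727820156410577229101238628035342
    ... and so on for the remaining 3 tokens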
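And a sketch of the order of operations for the new us-east nodes (host name and token are placeholders):

    # in cassandra.yaml on each new node, before first start:
    #   auto_bootstrap: false
    #   initial_token: <matching eu-west token + 100>
    # start the node; once all the new nodes are up and the keyspace
    # has RF 3 in us-east, stream the data from the original DC:
    nodetool -h new-node-1 rebuild eu-west
    # run the same on each new node, one at a time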
On 19/04/2013, at 2:41 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:

> Hi,
> 
> The company I work for is having so much success that we are expanding
> worldwide :). We have to deploy our Cassandra servers worldwide too in order
> to improve the latency of our new abroad customers.
> 
> I am wondering about the process to grow from one data center to a few of
> them. First thing is we use EC2Snitch for now. So I guess we have to switch
> to Ec2MultiRegionSnitch.
> 
> Is that doable without any down-time ?
> 
> Our C* cluster : C*1.2.2, 6 EC2 m1.xLarge in eu-west already running, wanting
> to add 3 m1.xLarge on us-east
> 
> I was planning to do it this way:
> 
> 1/ Change the yaml conf on each of the 6 eu-west existing nodes
> - Ec2Snitch to Ec2MultiRegionSnitch
> - uncomment the broadcast_address and set the public ip of the node
> - let the private ip as defined right now the listen_address
> - switch seeds from private to public IP
> 2/ Rolling restart
> - nodetool disablegossip
> - nodetool disablethrift
> - nodetool drain
> - rm /path/cassandra/commitlog/* ? (I used to do it since drain was
> broken to avoid replaying counters logs, leading to overcounts, not sure how
> pertinent this is nowadays)
> - service cassandra stop
> - service cassandra start
> 3/
> - Make sure everything is still running smoothly in eu-west servers
> 4/
> - Add 3 nodes one by one with auto_bootstrap set to true.
> 5/
> - Repair nodes (one by one)
> - Cleanup nodes (one by one)
> 
> Questions :
> 
> a/ Do I have to move the tokens since I don't use vnodes yet ? How should I
> position all these nodes ?
> b/ Is it useful to add a seed from the new us-east data center in the yaml of
> all nodes ?
> c/ I am using the SimpleStrategy. Is it worth it/mandatory to change this
> strategy when using multiple DC ?
> d/ With my 2 DC will I have 3 RF per DC or cross DC ?
> e/ Should I configure my C* client to use the C* nodes from their region as
> coordinators (which seems to me the good way) or should I configure all the
> servers everywhere ?
> 
> Any comment on the process described above would be appreciated, specially if
> you are arguing that something is wrong about it.
> 
> If you find out I am missing something, I will be glad to hear about it.
> 
> Alain