How well this works depends on the load (data size) and the latency between sites. Bootstrapping or backing up a node that holds a lot of data needs a fair amount of bandwidth if you want it done quickly. Latency is also likely to be very high if the traffic goes over something like an office VPN. But there's no reason you can't do what you're describing.
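To get a rough feel for the bandwidth side, here is a back-of-the-envelope sketch in Python. The function name, the data size, the link speeds, and the 70% usable-throughput figure are all made-up assumptions for illustration; they are not Cassandra settings or measurements, and real streaming times also depend on compaction, throttling, and other traffic on the link.

```python
def transfer_hours(data_gb: float, link_mbps: float, utilization: float = 0.7) -> float:
    """Rough hours needed to move data_gb gigabytes over a link_mbps link.

    utilization is the assumed fraction of the link that the transfer
    can actually use (hypothetical default, not a measured value).
    """
    if data_gb < 0 or link_mbps <= 0 or not (0 < utilization <= 1):
        raise ValueError("expected data_gb >= 0, link_mbps > 0, 0 < utilization <= 1")
    data_megabits = data_gb * 8 * 1000  # decimal units: 1 GB = 8000 megabits
    seconds = data_megabits / (link_mbps * utilization)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical node holding 500 GB, tried against a few circuit sizes.
    for mbps in (10, 100, 1000):
        hours = transfer_hours(500, mbps)
        print(f"{mbps:>5} Mbit/s: about {hours:.1f} hours")
```

Plugging in your own per-node data size and circuit speeds should show fairly quickly whether a bootstrap or rebuild over a given link is a matter of hours or days.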
You could set up a test cluster and see what the actual latency is. Most people use 4 nodes per POP with NetworkTopologyStrategy (NTS) for a multi-DC setup with RF=3.

Thanks,

James Briggs
--
Cassandra/MySQL DBA. Available in San Jose area or remote.

________________________________
From: David M <da3bob...@gmail.com>
To: user@cassandra.apache.org
Sent: Tuesday, September 9, 2014 5:49 PM
Subject: cassandra on own distributed network

Hi everyone,

I am at a loss for locating use cases/examples/documentation/books/etc. for deploying Cassandra where multi-DC nodes of a single cluster are on your own network at points around the world. In my example a Cassandra DC equates to a building.

Of interest to me is how installations are interconnecting their DCs (circuit bandwidth, latency requirements) for optimal replication/gossip/etc. and any lessons learned they can share.

I know there isn't going to be a single config that applies to every deployment/usage pattern/etc., but surely there are at least loose rules of thumb that will get me going (or maybe alternative deployments).

The interesting posts/blogs/books/etc. seem to reference Cassandra in the cloud (e.g. specifying AWS instance types), leaving out descriptions/usage/requirements at the network layer.

If anyone knows of any information on this topic that I've missed, I'd appreciate your sharing.

Thanks,
David