Re: advice for EC2 deployment

aaron morton Mon, 25 Apr 2011 22:14:55 -0700

For background see this article:
http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers


And this recent discussion:
http://www.mail-archive.com/user@cassandra.apache.org/msg12502.html

Issues that may be a concern:
- lots of cross-AZ latency in us-east, e.g. LOCAL_QUORUM ops must wait on 
cross-AZ round trips. Also consider maintenance tasks: how much of a pain is 
it going to be to have latency between every node?
- IMHO not having sufficient (by that I mean 3) replicas in a Cassandra DC to 
handle a single node failure when working at QUORUM reduces the utility of the 
DC. e.g. with a local RF of 2 in the west, the quorum is 2 (RF/2 rounded down, 
plus 1), and if you lose one node from the replica set you will not be able to 
use LOCAL_QUORUM for keys in that range. Or consider a failure mode where the 
west is disconnected from the east.
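
To make the quorum arithmetic above concrete, here is a minimal sketch in 
Python. The function names are made up for illustration and are not part of 
any Cassandra API; the only thing taken from Cassandra is the quorum size, 
which is RF/2 rounded down, plus 1.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM-style operation."""
    return replication_factor // 2 + 1


def tolerable_failures(replication_factor: int) -> int:
    """Replicas that can be down while a quorum can still be reached."""
    return replication_factor - quorum(replication_factor)


# Print quorum size and failure tolerance for small replication factors.
for rf in (1, 2, 3):
    print(f"RF={rf}: quorum={quorum(rf)}, "
          f"can lose {tolerable_failures(rf)} replica(s)")
```

With a local RF of 2 the quorum is 2, so no replica can be lost; with a local 
RF of 3 the quorum is still 2, so one replica can be down.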

Could you start simple with 3 replicas in one AZ in us-east and 3 replicas in 
an AZ+Region? Then work through some failure scenarios.
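
One rough way to work through failure scenarios on paper is a small script 
like the one below. It is only a sketch under some assumptions: it looks at 
the replica set for a single key range, it uses "us-east" and "us-west" purely 
as labels for the two DCs, it assumes the coordinator is in the local DC, and 
the function names are hypothetical. It does not talk to a cluster.

```python
def quorum(rf: int) -> int:
    """Replicas needed for a quorum: RF/2 rounded down, plus 1."""
    return rf // 2 + 1


def available_levels(rf_by_dc, unreachable_by_dc, local_dc):
    """Which consistency levels can still be met for one key range.

    rf_by_dc:          replicas per DC, e.g. {"us-east": 3, "us-west": 3}
    unreachable_by_dc: replicas per DC the coordinator cannot reach
    local_dc:          the DC the coordinator lives in
    """
    live = {dc: rf - unreachable_by_dc.get(dc, 0)
            for dc, rf in rf_by_dc.items()}
    total_rf = sum(rf_by_dc.values())
    total_live = sum(live.values())
    return {
        "ONE": total_live >= 1,
        "LOCAL_QUORUM": live[local_dc] >= quorum(rf_by_dc[local_dc]),
        # Plain QUORUM counts replicas across all DCs.
        "QUORUM": total_live >= quorum(total_rf),
    }


layout = {"us-east": 3, "us-west": 3}
scenarios = {
    "all healthy": {},
    "one east replica down": {"us-east": 1},
    "two east replicas down": {"us-east": 2},
    "west unreachable from east": {"us-west": 3},
}
for name, unreachable in scenarios.items():
    print(name, available_levels(layout, unreachable, "us-east"))
```

Swapping the layout for {"us-east": 2, "us-west": 1} shows that a single 
replica failure in the east already rules out LOCAL_QUORUM there.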

Hope that helps. 
Aaron
  

On 22 Apr 2011, at 03:28, William Oberman wrote:

> Hi,
> 
> My service is not yet ready to be fully multi-DC, due to how some of my 
> legacy MySQL stuff works.  But, I wanted to get cassandra going ASAP and work 
> towards multi-DC.  I have two main cassandra use cases: one where I can 
> handle eventual consistency (and all of the writes/reads are currently ONE), 
> and one where I can't (writes/reads are currently QUORUM).  My test cluster 
> is currently 4 smalls all in us-east with RF=3 (more to prove I can do 
> clustering than to have an exact production replica).  All of my unit tests 
> and "load tests" (again, not to prove true max load, but more to tease out 
> concurrency issues) are passing now.
> 
> For production, I was thinking of doing:
> -4 cassandra larges in us-east (where I am now), one in each AZ
> -1 cassandra large in us-west (where I have nothing)
> For now, my data can fit into a single large's 2 disk ephemeral using RAID0, 
> and I was then thinking of doing a RF=3 with us-east=2 and us-west=1.  If I 
> do eventual consistency at ONE, and consistency at LOCAL_QUORUM, I was hoping:
> -eventual consistency ops would be really fast
> -consistent ops would be pretty fast (what does LOCAL_QUORUM do in this case? 
>  return after 1 or 2 us-east nodes ack?)
> -us-west would contain a complete copy of my data, so it's a good eventually 
> consistent "close to real time" backup  (assuming it can keep up over long 
> periods of time, but I think it should)
> -eventually, when I'm ready to roll out in us-west I'll be able to change the 
> replication settings and that server in us-west could help seed new cassandra 
> instances faster than the ones in us-east
> 
> Or am I missing something really fundamental about how cassandra works making 
> this a terrible plan?  I should have plenty of time to get my multi-DC 
> working before the instance in us-west fills up (but even then, I should be 
> able to add instances over there to stall fairly trivially, right?).
> 
> Thanks!
> 
> will
