On Wed, May 9, 2012 at 12:38 AM, Andrew Thompson <and...@hijacked.us> wrote:

> > Does the approximately 1 ms of latency between av zones affect Riak's
> > performance that much?
>
> If the latency is *guaranteed* to be that low, then you should be ok,
> although I'm not sure how the networking works across zones. If the
> latency can do crazy things in outage conditions, you'll stand a decent
> chance of screwing the cluster. A downed node is better than a really,
> really slow one.

Well, one thing about AWS is that nothing is guaranteed. I have seen latency spike up to 10 ms between zones, but it's brief. The zones may or may not be in the same building, but they are close together and share the same 10.x.x.x space. Amazon currently charges 1¢/GB for transfer between zones, so there are obviously some network constraints between them compared to machines inside a zone.

> > We were planning to run across av zones for fault tolerance, just beefing
> > up single nodes for the moment until rack awareness is available. So the
> > recommended solution is to use EDS to accomplish this?
>
> I'm not sure what you're describing here.

Basically, we are planning to run a single 3 node cluster, with 1 node in each av zone. We use this technique with a 3 node Galera cluster (synchronous MySQL replication). Galera handles a disappearing node very well, so if an av zone starts acting up, the remaining machines continue working. We run all our instance types in multiple zones so we can handle an av zone going down.

From what you're describing, Riak/Erlang doesn't handle a flaky node/network well, so some manual intervention would be needed if a node/network starts acting funny.

Because Riak doesn't offer rack awareness (we could treat each av zone as a rack), and we still want copies of our data in multiple zones, our only option to ensure live data is replicated in all the zones (for high availability) is to set the number of replicas equal to the number of nodes. We'll be fine until we outgrow the largest EC2 instance type.

Is rack awareness a planned feature? If so, when (ballpark) is it planned for?

> Actually, it's worse than that because of some legacy behaviour. EDS
> wants to know the bind IP, not a hostname, and it will exchange node IPs
> with the other side of the connection, so internal IPs can 'leak' to the
> other cluster and cause connection problems. There is a workaround for
> this, and I do plan to address it.

I suppose the interim solution for EDS across EC2 regions is to use a VPC in each region, give each one a unique 10.x.x.x subnet, and run a VPN between them. But we're not at the point of deploying to multiple regions yet, so there's no need to dig more into this at the moment.

Thank you for answering my questions. This really helps!

-Mark
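
P.S. The VPC-plus-VPN idea only works if the per-region subnets really are disjoint; if two regions share a range, routing across the VPN can't tell the two sides apart. Here is a minimal sketch of that check using Python's standard `ipaddress` module. The region names and CIDR blocks are made up for illustration, and `find_overlaps` is just a hypothetical helper name, not anything from Riak or AWS.

```python
import ipaddress
from itertools import combinations

# Hypothetical per-region VPC CIDR blocks. These names and ranges are
# illustrative assumptions, not values taken from this thread.
REGION_SUBNETS = {
    "us-east-1": "10.1.0.0/16",
    "us-west-1": "10.2.0.0/16",
    "eu-west-1": "10.3.0.0/16",
}


def find_overlaps(subnets):
    """Return the pairs of names (in sorted order) whose CIDR blocks overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    ]


if __name__ == "__main__":
    clashes = find_overlaps(REGION_SUBNETS)
    if clashes:
        for a, b in clashes:
            print(f"{a} and {b} use overlapping subnets")
    else:
        print("all region subnets are disjoint")
```

Running something like this before bringing up the VPN would catch a clash early, which is cheaper than debugging it after node IPs have been exchanged between clusters.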

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com