Hey Andrew,

Can you elaborate on how EDS replication does this mirroring? Does each vnode have the ability to connect to the other cluster, or is there a coordinator that sends data to the other cluster, etc.?

Thanks,
Ahmed

On Mon, May 7, 2012 at 6:25 PM, Andrew Thompson <and...@hijacked.us> wrote:
> On Tue, May 01, 2012 at 03:49:02PM -0400, Mark Rose wrote:
>> I've got some questions about Riak Enterprise I haven't been able to find
>> the answers to.
>
> Hi Mark, I'm the riak EDS 'maintainer'. Sorry I didn't reply earlier, I
> was travelling all week.
>
>> I understand that the open source version of Riak's replication is designed
>> for single data center usage only, but I'm unsure about how Riak Enterprise
>> handles replication. Specifically, I'm curious about locality and high
>> availability.
>>
>> Our setup is already running in multiple availability zones on EC2. We're
>> running Galera across the zones to provide both redundancy and a local copy
>> of the data to avoid the network latency of going to another zone. However,
>> Galera, as nice as it is, doesn't scale writes. We're going to be using
>> Riak to store a lot of information going forward, and may eventually move
>> our existing data to it as well.
>>
>> The only thing holding us back from going to multiple regions on Amazon is
>> our datastore.
>>
>> How well does Riak handle layered topologies, such as EC2?
>>
>> Is it possible to configure Riak Enterprise to store two copies of the data
>> in each EC2 region, ensuring that the two copies are in different zones
>> when there is more than one Riak server in a zone?
>
> Current EDS replication is pretty simple, it will just try to
> (eventually) ensure that data on one cluster is mirrored on another. It
> won't forward reads and riak doesn't have anything like 'rack
> awareness', at least not yet.
>
>> When a query is run, is it run in one region only? Would Riak prefer copies
>> of the data in the local zone?
>
> Riak only queries the local cluster, yes.
>
>> For what it's worth, our current datastore load is roughly half and half
>> writes and reads. We heavily cache reads with memcache (99%). We may drop
>> memcache if reads on Riak prove fast enough (thus avoiding the issues of
>> invalidating remote caches).
>
> Given the current limitations, you'd probably be best off with N
> clusters in different regions and/or zones. Don't try to span a single
> cluster across a zone, or even worse, a region. Then hook them together
> with replication.
>
> There's also some fun with NAT on EC2, but it can be made to work.
>
> Let me know if that helps,
>
> Andrew
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com