All things being equal (which is arrrrtarded[0] because I don't know squat about your use case), I would go with choice #2. My main concern with #1 would be multiple failures before read repair could be executed. Depending on data size, this could take a long time.
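To put a rough number on "a long time", here is a back-of-envelope sketch in Python. Everything in it is an assumption for illustration: the data size, the sustained copy throughput, the per-node MTBF, and the idea that two replicas remain after the first loss (i.e. three copies to start with). It models a plain bulk re-copy with independent failures, not Riak's actual read repair behaviour, so treat the result as an order-of-magnitude guess at best.

```python
import math


def repair_window_hours(data_gb: float, throughput_mb_s: float) -> float:
    """Hours needed to re-copy data_gb of data at a sustained throughput_mb_s."""
    return (data_gb * 1024) / throughput_mb_s / 3600


def p_failure_within(window_hours: float, mtbf_hours: float, nodes: int) -> float:
    """Probability that at least one of `nodes` fails during the window.

    Assumes independent, exponentially distributed failures with the given
    per-node MTBF. That is a simplification; real failures are often correlated.
    """
    return 1 - math.exp(-nodes * window_hours / mtbf_hours)


# Hypothetical numbers, not measurements from any real cluster.
window = repair_window_hours(data_gb=500, throughput_mb_s=50)
risk = p_failure_within(window, mtbf_hours=8760, nodes=2)
print(f"repair window: {window:.1f} h, chance of a further loss: {risk:.4%}")
```

Plug in your own data size and whatever throughput you actually see between nodes. The point is only that the exposure window grows linearly with the amount of data that has to be moved.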
Summary:

1. Potential data loss due to multiple failures (a node or a RAID 0 disk). Initially your data will still have replicas on other nodes, but read repair would have to push your data back to where it should be before another loss occurred. Multiple loss = mega bad.

2. + Easy recovery / - Less performant.

Unfortunately, what the compute nodes buy you is fast ephemeral disk. What you need is fast persisted disk. This, imho, is not a good match for a database system.

Re. EBS - do not use multiple availability zones. Keep all your instances in the same location as your disks. If you want multi-datacenter redundancy, either write it yourself or buy it from Basho. Relying on Amazon to do this for you is a bad move in this use case.

Riak is in production on EC2; I'm sure the Basho guys will talk about what they can and offer best practices.

Keep us posted!

-alexander

[0] Retarded pirate.

-----Original Message-----
From: David Dawson <david.daw...@gmail.com>
Sender: riak-users-boun...@lists.basho.com
Date: Wed, 30 Mar 2011 17:29:31
To: <riak-users@lists.basho.com>
Subject: EC2 and RIAK

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com