We're in the process of migrating our web apps to Rails 3 (some of them even from PHP), and I'm the one responsible for our new Amazon EC2 setup.
We've been looking at new ways of storing our data (quite a lot of files, but also data that was previously in a MySQL DB). I've been trying out MongoDB for a while and quite liked it, though we've had some doubts about the security of our data, among other things.

The ease of replication in Riak made me take a closer look at it. I really like how replication is at the core of Riak and how data is just data regardless of what it consists of (files, JSON, XML, whatever). I'm also looking at Riak because we've discussed saving things to S3, and I think Riak could be used here instead.

One thing that is very important to us is the security of all communication within our cluster, so I'm wondering how the Riak nodes exchange data with each other - is that traffic encrypted, or could someone potentially listen in on it? Could we easily encrypt it through stunnel (as we've done with some other protocols), and would that affect the performance of Riak a lot?

We're not entirely sure whether we want or need replication between EC2 US and EC2 Europe (and possibly Asia as well), but if we do - would Riak work well over such distances?

Is Riak suited to running on Xen VMs like those on EC2? What would be a recommended instance size (i.e. does Riak require LOTS of RAM and processors)? Does anyone have experience with Riak running on EC2?

So far, most of the files we store are somewhere between 500 KB and 50 MB, but they might get larger than that as we expand our business to many more clients with different needs. I know there's supposed to be a limit around 50 MB currently... Is this being actively worked on?

Backups... well, we've done all kinds of 'em. The latest, and in my opinion most straightforward, is to put all data on an XFS volume, freeze it (incl. "FLUSH TABLES WITH READ LOCK" on MySQL), and then just do an EBS snapshot of the volume. How would something like that apply to Riak (even though the data might be safer in a Riak cluster)?
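A minimal sketch of the freeze-and-snapshot sequence described above, assuming the xfs_freeze utility and the AWS CLI (aws ec2 create-snapshot) are installed on the instance. The function name, mount point and volume ID are hypothetical placeholders. The MySQL read lock is not shown, because it only holds while the client session that issued it stays open, so it would have to wrap this call.

```python
import subprocess


def snapshot_frozen_volume(mount_point, volume_id, description, run=subprocess.run):
    """Freeze an XFS filesystem, request an EBS snapshot, then always unfreeze.

    `run` is injectable so the sequence can be dry-run without root or AWS access.
    """
    # Suspend writes so the filesystem is in a consistent state on disk.
    run(["xfs_freeze", "-f", mount_point], check=True)
    try:
        # Ask EC2 to start a snapshot of the (hypothetical) volume.
        run(
            [
                "aws", "ec2", "create-snapshot",
                "--volume-id", volume_id,
                "--description", description,
            ],
            check=True,
        )
    finally:
        # Unfreeze even if the snapshot request failed, so writes can resume.
        run(["xfs_freeze", "-u", mount_point], check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    snapshot_frozen_volume(
        "/data", "vol-0123456789abcdef0", "nightly backup",
        run=lambda cmd, check: print(" ".join(cmd)),
    )
```

The try/finally is the important part: a failed snapshot request must never leave the volume frozen. Whether the same approach gives a usable backup for a multi-node Riak cluster is exactly the open question here, since each node's volume would be frozen at a slightly different moment.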
If we run a cluster of, say, 10 nodes - how would we get back to a point in time? And if a DELETE was done erroneously, how could we get that data back? Is riak-admin backup really an option (i.e. backing up the entire cluster) if the dataset is very large? If we restore from such a backup, would the whole cluster get "reset" to that point in time?

One of the things I'm quite excited about is how we could get away with nodes that are exactly alike and load-balance between them - each instance running a Riak node, so we could have our app and the Riak node on the same machine and just add more such instances if we need to. Today we've separated clients onto different databases and machines (NOT load-balanced and NOT replicated). This has worked OK, but I think having a load-balanced cluster would make things easier and the configuration of machines simpler (we use Chef, so it's not THAT bad today). Right now we add clients by adding a CNAME entry in DNS for a specific machine/IP and adding configuration and databases using nanite, but I think much of that could perhaps disappear with a single load-balanced cluster...

Thanks for taking the time to look at my questions!

Kind regards,
John

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com