Re: Cassandra at Amazon AWS

2013-01-17 Thread Marcelo Elias Del Valle
Everyone, thanks a lot for the answers, they helped me a lot. 2013/1/17 Andrey Ilinykh > I'd recommend Priam. > > http://techblog.netflix.com/2012/02/announcing-priam.html > > Andrey > > > On Thu, Jan 17, 2013 at 5:44 AM, Adam Venturella wrote: > >> Jared, how do you guys handle data backups for

Re: Cassandra at Amazon AWS

2013-01-17 Thread Andrey Ilinykh
I'd recommend Priam. http://techblog.netflix.com/2012/02/announcing-priam.html Andrey On Thu, Jan 17, 2013 at 5:44 AM, Adam Venturella wrote: > Jared, how do you guys handle data backups for your ephemeral based > cluster? > > I'm trying to move to ephemeral drives myself, and that was my last

Re: Cassandra at Amazon AWS

2013-01-17 Thread Jared Biel
We use a replication factor such that if any one instance dies the cluster would remain alive. If a node dies, we simply replace it and move on. As far as disaster recovery goes, it's easy to store snapshots in S3, although Glacier is looking interesting. Jared Biel System Administrator Bolder Thinking
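
The "any one instance can die" rule can be checked with a little arithmetic. The sketch below is an illustration, not something from the thread: it assumes the standard Cassandra consistency levels ONE, QUORUM and ALL, and the function name `tolerated_replica_failures` is hypothetical.

```python
def tolerated_replica_failures(replication_factor: int, consistency: str = "QUORUM") -> int:
    """How many replicas of a row can be down while requests still succeed.

    Assumes the usual definitions: ONE needs 1 replica, QUORUM needs
    floor(RF / 2) + 1 replicas, and ALL needs every replica.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    required = {
        "ONE": 1,
        "QUORUM": replication_factor // 2 + 1,
        "ALL": replication_factor,
    }[consistency.upper()]
    return replication_factor - required


if __name__ == "__main__":
    for rf in (1, 2, 3, 5):
        print(rf, tolerated_replica_failures(rf, "QUORUM"))
```

Under these assumptions, surviving one dead node at QUORUM needs a replication factor of at least 3, while a replication factor of 2 survives one dead node only at consistency level ONE.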

Re: Cassandra at Amazon AWS

2013-01-17 Thread William Oberman
I have a "peer EBS disk" to the ephemeral disk. Then I do nodetool snapshot -> rsync from ephemeral to EBS -> take snapshot of EBS. Syncing nodetool snapshot directly to S3 would involve fewer steps and be cheaper (EBS costs more than S3), but I do post processing on the snapshot for EMR, and it
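
The three-step flow described here (nodetool snapshot, rsync to EBS, EBS snapshot) might be scripted roughly as follows. This is a sketch under assumptions, not the poster's actual script: the paths, the volume ID and the function name `backup_commands` are hypothetical, the EBS step uses the current `aws ec2 create-snapshot` CLI command rather than whatever tooling was used in 2013, and the rsync filters assume snapshots live under `<data_dir>/<keyspace>/<table>/snapshots/<tag>/`. The function only builds the command lines; running them is left to the caller.

```python
import shlex


def backup_commands(tag, data_dir, ebs_mount, volume_id):
    """Return the three backup steps, in order, as argv lists."""
    # Step 1: take a named snapshot (hard links) on the ephemeral disk.
    take_snapshot = ["nodetool", "snapshot", "-t", tag]
    # Step 2: copy only the files under snapshots/<tag>/ to the EBS mount,
    # skipping live SSTables and pruning directories left empty.
    sync_to_ebs = [
        "rsync", "-a", "--prune-empty-dirs",
        "--include=*/",
        f"--include=**/snapshots/{tag}/**",
        "--exclude=*",
        data_dir.rstrip("/") + "/",
        ebs_mount.rstrip("/") + "/",
    ]
    # Step 3: snapshot the EBS volume itself.
    snapshot_ebs = [
        "aws", "ec2", "create-snapshot",
        "--volume-id", volume_id,
        "--description", f"cassandra {tag}",
    ]
    return [take_snapshot, sync_to_ebs, snapshot_ebs]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for argv in backup_commands("daily", "/var/lib/cassandra/data",
                                "/mnt/ebs/backup", "vol-0123456789abcdef0"):
        print(shlex.join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` so that a failed step stops the chain before the EBS snapshot is taken.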

Re: Cassandra at Amazon AWS

2013-01-17 Thread Adam Venturella
Jared, how do you guys handle data backups for your ephemeral based cluster? I'm trying to move to ephemeral drives myself, and that was my last sticking point; asking how others in the community deal with backup in case the VM explodes. On Wed, Jan 16, 2013 at 1:21 PM, Jared Biel wrote: > We'

Re: Cassandra at Amazon AWS

2013-01-16 Thread Jared Biel
We're currently using Cassandra on EC2 at very low scale (a 2 node cluster on m1.large instances in two regions.) I don't believe that EBS is recommended for performance reasons. Also, it's proven to be very unreliable in the past (most of the big/notable AWS outages were due to EBS issues.) We've

Re: Cassandra at Amazon AWS

2013-01-16 Thread Andrey Ilinykh
Storage size is not a problem, you can always add more nodes. Anyway, it is not recommended to have nodes with more than 500G (compaction and repair take forever). EC2 m1.large has 800G of ephemeral storage, EC2 m1.xlarge 1.6T. I'd recommend xlarge, it has 4 CPUs, so maintenance procedures don't affec
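
As a rough worked example of the 500G-per-node guideline, the minimum cluster size follows from the raw data size and the replication factor. The function name `min_nodes` and the example figures are hypothetical, and the estimate ignores compaction headroom and uneven token distribution.

```python
import math


def min_nodes(raw_data_gb: float, replication_factor: int, per_node_cap_gb: float = 500.0) -> int:
    """Smallest node count that keeps each node at or under the cap.

    Total on-disk data is raw data times the replication factor, assumed
    to be spread evenly. A cluster also needs at least as many nodes as
    the replication factor.
    """
    total_gb = raw_data_gb * replication_factor
    return max(replication_factor, math.ceil(total_gb / per_node_cap_gb))


if __name__ == "__main__":
    # Hypothetical workload: 2 TB of raw data, three copies of each row.
    print(min_nodes(2000, 3))
```

With 2000G of raw data and a replication factor of 3, that is 6000G on disk, so at least 12 nodes at 500G each.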

Re: Cassandra at Amazon AWS

2013-01-16 Thread Ben Chobot
We use Cassandra on ephemeral drives. Yes, that means we need more nodes to hold more data, but doesn't that play into Cassandra's strengths? It sounds like you're trying to vertically scale your Cassandra cluster. On Jan 16, 2013, at 12:42 PM, Marcelo Elias Del Valle wrote: > Hello, > >I