Re: Cassandra in the cloud

2010-06-06 Thread David Boxenhorn
Thanks, everybody. Your advice will be carefully considered in our decision making. On Fri, Jun 4, 2010 at 1:46 AM, Ben Standefer wrote: > Mike, yep, there are a lot of benchmarks proving it (plus it just makes > sense) > > http://stu.mp/2009/12/disk-io-and-throughput-benchmarks-on-amazons-e

Re: Cassandra in the cloud

2010-06-03 Thread Ben Standefer
Mike, yep, there are a lot of benchmarks proving it (plus it just makes sense) http://stu.mp/2009/12/disk-io-and-throughput-benchmarks-on-amazons-ec2.html http://www.mysqlperformanceblog.com/2009/08/06/ec2ebs-single-and-raid-volumes-io-bencmark/ http://orion.heroku.com/past/2009/7/29/io_performanc

Re: Cassandra in the cloud

2010-06-03 Thread Mike Subelsky
Ben, thanks for that, we may try that. I did find an AWS forum tidbit from two years ago: "4 ephemeral stores striped together can give significantly higher throughput for sequential writes than EBS." http://developer.amazonwebservices.com/connect/thread.jspa?messageID=125197&#125197 -Mike On Thu, J

Re: Cassandra in the cloud

2010-06-03 Thread Ben Standefer
The commit log and data directory are on the same mounted directory structure (the 2 RAID 0 striped ephemeral disks) rather than using 1 of the ephemeral disks for the commit log and 1 of the ephemeral disks for the data directory. While it's usually advised that for disk utilization reasons you keep th

Re: Cassandra in the cloud

2010-06-03 Thread Mike Subelsky
Ben, do you just keep the commit log on the ephemeral drive? Or data and commit? (I was confused by your reference to XFS and snapshots -- I assume you keep data on the XFS drive) -Mike On Thu, Jun 3, 2010 at 2:29 PM, Ben Standefer wrote: > We're using Cassandra on AWS at SimpleGeo.  We softwa

Re: Cassandra in the cloud

2010-06-03 Thread Ben Standefer
We're using Cassandra on AWS at SimpleGeo. We software RAID 0 stripe the ephemeral drives to achieve better I/O and have machines in multiple Availability Zones with a custom EndPointSnitch that replicates the data between AZs for high availability (to be open-sourced/contributed at some point).
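For readers unfamiliar with striping, here is a minimal sketch of what RAID 0 does with a logical byte offset, assuming fixed-size chunks placed round-robin across the disks. The function name `raid0_locate` and the 64 KiB chunk size are illustrative assumptions, not details of the SimpleGeo setup.

```python
def raid0_locate(offset, num_disks, chunk_size=64 * 1024):
    """Map a logical byte offset to (disk index, byte offset on that disk).

    Hypothetical helper: models plain round-robin striping only.
    """
    if offset < 0 or num_disks < 1 or chunk_size < 1:
        raise ValueError("offset must be >= 0; num_disks and chunk_size must be >= 1")
    chunk = offset // chunk_size    # which chunk of the logical volume
    disk = chunk % num_disks        # chunks rotate across the disks
    stripe = chunk // num_disks     # how many full stripes precede this chunk
    return disk, stripe * chunk_size + offset % chunk_size


# Consecutive chunks land on alternating disks in a 2-disk array.
for logical in (0, 64 * 1024, 128 * 1024):
    print(logical, raid0_locate(logical, num_disks=2))
```

Because neighbouring chunks sit on different disks, a large sequential read or write can keep both disks busy at once, which is where the I/O gain comes from. The cost is that losing either disk loses the whole volume.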

Re: Cassandra in the cloud

2010-06-03 Thread Eric Evans
On Thu, 2010-06-03 at 11:29 +0300, David Boxenhorn wrote: > We want to try out Cassandra in the cloud. Any recommendations? > Comments? > > Should we use Amazon? Rackspace? Something else? I personally haven't used Cassandra on EC2, but others have reported significantly better disk IO, (and hen

Re: Cassandra in the cloud

2010-06-03 Thread David King
> We want to try out Cassandra in the cloud. Any recommendations? Comments? > Should we use Amazon? Rackspace? Something else? I'm using it on Amazon, mostly with success. I'd recommend increasing Phi from 8 to 10, using the 4-core/15GB instances to start, and if you plan to be really heavy on rea
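To give a feel for what raising Phi from 8 to 10 changes, here is a rough sketch. It assumes Phi is the conviction threshold of a phi accrual failure detector and that heartbeat gaps follow an exponential distribution, which is a simplification. The function names and the 1-second mean interval are illustrative assumptions, not values from this thread.

```python
import math


def phi(seconds_since_last_heartbeat, mean_interval):
    """Suspicion level: -log10 of the chance that a heartbeat is merely late."""
    # Under the exponential assumption, P(gap > t) = exp(-t / mean),
    # so -log10 of that is t / (mean * ln 10).
    return seconds_since_last_heartbeat / (mean_interval * math.log(10))


def seconds_until_convicted(threshold, mean_interval):
    """Silence needed before phi reaches the threshold (hypothetical helper)."""
    return threshold * mean_interval * math.log(10)


for threshold in (8, 10):
    wait = seconds_until_convicted(threshold, mean_interval=1.0)
    print(f"threshold {threshold}: convicted after about {wait:.1f} s of silence")
```

Under these assumptions, a higher threshold makes the detector tolerate a longer silence before marking a node down. That trades slower detection of real failures for fewer false alarms on a network with variable latency.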