Ben, thanks for that; we may try it. I did find an AWS forum tidbit from two years ago:

"4 ephemeral stores striped together can give significantly higher
throughput for sequential writes than EBS."

http://developer.amazonwebservices.com/connect/thread.jspa?messageID=125197&#125197

-Mike

On Thu, Jun 3, 2010 at 5:57 PM, Ben Standefer <b...@simplegeo.com> wrote:
> The commit log and data directory are on the same mounted directory
> structure (the 2 RAID 0 striped ephemeral disks) rather than using 1
> of the ephemeral disks for the commit log and 1 of the ephemeral disks
> for the data directory. While it's usually advised that for disk
> utilization reasons you keep the commit logs and data directory on
> separate disks, our RAID 0 configuration gives us much more space for
> the data directory without having to mess with EBSes. We've found it
> to be fine for now.
>
> I see how my XFS snapshots reference was confusing. Our plan is to
> have a single AZ use EBSes for the data directory so that we can more
> easily snapshot our data (trusting our AZ-aware EndPointSnitch),
> while other AZs will continue using ephemeral drives.
>
> -Ben Standefer
>
>
> On Thu, Jun 3, 2010 at 1:26 PM, Mike Subelsky <m...@subelsky.com> wrote:
>> Ben,
>>
>> do you just keep the commit log on the ephemeral drive? Or data and
>> commit? (I was confused by your reference to XFS and snapshots -- I
>> assume you keep data on the XFS drive)
>>
>> -Mike
>>
>> On Thu, Jun 3, 2010 at 2:29 PM, Ben Standefer <b...@simplegeo.com> wrote:
>>> We're using Cassandra on AWS at SimpleGeo. We software RAID 0 stripe
>>> the ephemeral drives to achieve better I/O and have machines in
>>> multiple Availability Zones with a custom EndPointSnitch that
>>> replicates the data between AZs for high availability (to be
>>> open-sourced/contributed at some point).
>>>
>>> Using XFS as described here
>>> http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1663
>>> also makes it very easy to snapshot your cluster to S3.
>>>
>>> We've had no real problems with EC2 and Cassandra; it's been great.
>>>
>>> -Ben Standefer
>>>
>>>
>>> On Thu, Jun 3, 2010 at 11:51 AM, Eric Evans <eev...@rackspace.com> wrote:
>>>> On Thu, 2010-06-03 at 11:29 +0300, David Boxenhorn wrote:
>>>>> We want to try out Cassandra in the cloud. Any recommendations?
>>>>> Comments?
>>>>>
>>>>> Should we use Amazon? Rackspace? Something else?
>>>>
>>>> I personally haven't used Cassandra on EC2, but others have reported
>>>> significantly better disk I/O (and hence better performance) with
>>>> Rackspace's Cloud Servers.
>>>>
>>>> Full disclosure though, I work for Rackspace. :)
>>>>
>>>> --
>>>> Eric Evans
>>>> eev...@rackspace.com
>>>>
>>>>
>>>
>>
>>
>>
>> --
>> Mike Subelsky
>> oib.com // ignitebaltimore.com // subelsky.com
>> @subelsky // (410) 929-4022
>>
>

--
Mike Subelsky
oib.com // ignitebaltimore.com // subelsky.com
@subelsky