The commit log and data directory are on the same mounted directory structure (the two RAID 0 striped ephemeral disks) rather than using one of the ephemeral disks for the commit log and the other for the data directory. While it's usually advised that, for disk utilization reasons, you keep the commit log and data directory on separate disks, our RAID 0 configuration gives us much more space for the data directory without having to deal with EBS volumes. We've found it to be fine for now.
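For anyone who wants to check how their own node is laid out, here is a minimal Python sketch (not taken from our setup; the paths are hypothetical placeholders) that reports whether two directories sit on the same mounted device by comparing the st_dev field returned by os.stat:

```python
import os


def same_filesystem(path_a: str, path_b: str) -> bool:
    """Return True if both paths are on the same mounted device."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # Hypothetical locations; substitute the directories from your own config.
    commit_log_dir = "/mnt/cassandra/commitlog"
    data_dir = "/mnt/cassandra/data"
    try:
        shared = same_filesystem(commit_log_dir, data_dir)
        print("same device" if shared else "separate devices")
    except FileNotFoundError as exc:
        print(f"path not found: {exc.filename}")
```

With both directories on one RAID 0 array, as described above, this should report the same device. A separate-disk layout would report separate devices.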
I see how my XFS snapshots reference was confusing. Our plan is to have a single AZ use EBS volumes for the data directory so that we can more easily snapshot our data (trusting our AZ-aware EndPointSnitch to replicate the data into that AZ), while the other AZs will continue to use ephemeral drives.

-Ben Standefer

On Thu, Jun 3, 2010 at 1:26 PM, Mike Subelsky <m...@subelsky.com> wrote:
> Ben,
>
> do you just keep the commit log on the ephemeral drive? Or data and
> commit? (I was confused by your reference to XFS and snapshots -- I
> assume you keep data on the XFS drive)
>
> -Mike
>
> On Thu, Jun 3, 2010 at 2:29 PM, Ben Standefer <b...@simplegeo.com> wrote:
>> We're using Cassandra on AWS at SimpleGeo. We software RAID 0 stripe
>> the ephemeral drives to achieve better I/O and have machines in
>> multiple Availability Zones with a custom EndPointSnitch that
>> replicates the data between AZs for high availability (to be
>> open-sourced/contributed at some point).
>>
>> Using XFS as described here
>> http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1663
>> also makes it very easy to snapshot your cluster to S3.
>>
>> We've had no real problems with EC2 and Cassandra, it's been great.
>>
>> -Ben Standefer
>>
>>
>> On Thu, Jun 3, 2010 at 11:51 AM, Eric Evans <eev...@rackspace.com> wrote:
>>> On Thu, 2010-06-03 at 11:29 +0300, David Boxenhorn wrote:
>>>> We want to try out Cassandra in the cloud. Any recommendations?
>>>> Comments?
>>>>
>>>> Should we use Amazon? Rackspace? Something else?
>>>
>>> I personally haven't used Cassandra on EC2, but others have reported
>>> significantly better disk IO, (and hence, better performance), with
>>> Rackspace's Cloud Servers.
>>>
>>> Full disclosure though, I work for Rackspace. :)
>>>
>>> --
>>> Eric Evans
>>> eev...@rackspace.com
>>>
>>>
>>
>
>
>
> --
> Mike Subelsky
> oib.com // ignitebaltimore.com // subelsky.com
> @subelsky // (410) 929-4022
>