Ben,

thanks for that, we may try that.  I did find an AWS forum tidbit from
two years ago:

"4 ephemeral stores striped together can give significantly higher
throughput for sequential writes than EBS."

http://developer.amazonwebservices.com/connect/thread.jspa?messageID=125197&#125197

-Mike

On Thu, Jun 3, 2010 at 5:57 PM, Ben Standefer <b...@simplegeo.com> wrote:
> The commit log and data directory are on the same mounted directory
> structure (the 2 RAID 0 striped ephemeral disks) rather than using 1
> of the ephemeral disks for the commit log and 1 of the ephemeral disks for
> the data directory.  While it's usually advised that for disk
> utilization reasons you keep the commit logs and data directory on
> separate disks, our RAID 0 configuration gives us much more space for
> the data directory without having to mess with EBSes.  We've found it
> to be fine for now.
>
> I see how my XFS snapshots reference was confusing.  Our plan is to
> have a single AZ use EBSes for the data directory so that we can more
> easily snapshot our data (trusting our AZ-aware EndPointSnitch),
> while other AZs will continue using ephemeral drives.
>
> -Ben Standefer
>
>
> On Thu, Jun 3, 2010 at 1:26 PM, Mike Subelsky <m...@subelsky.com> wrote:
>> Ben,
>>
>> do you just keep the commit log on the ephemeral drive?  Or data and
>> commit? (I was confused by your reference to XFS and snapshots -- I
>> assume you keep data on the XFS drive)
>>
>> -Mike
>>
>> On Thu, Jun 3, 2010 at 2:29 PM, Ben Standefer <b...@simplegeo.com> wrote:
>>> We're using Cassandra on AWS at SimpleGeo.  We software RAID 0 stripe
>>> the ephemeral drives to achieve better I/O and have machines in
>>> multiple Availability Zones with a custom EndPointSnitch that
>>> replicates the data between AZs for high availability (to be
>>> open-sourced/contributed at some point).
>>>
>>> Using XFS as described here
>>> http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1663
>>> also makes it very easy to snapshot your cluster to S3.
>>>
>>> We've had no real problems with EC2 and Cassandra, it's been great.
>>>
>>> -Ben Standefer
>>>
>>>
>>> On Thu, Jun 3, 2010 at 11:51 AM, Eric Evans <eev...@rackspace.com> wrote:
>>>> On Thu, 2010-06-03 at 11:29 +0300, David Boxenhorn wrote:
>>>>> We want to try out Cassandra in the cloud. Any recommendations?
>>>>> Comments?
>>>>>
>>>>> Should we use Amazon? Rackspace? Something else?
>>>>
>>>> I personally haven't used Cassandra on EC2, but others have reported
>>>> significantly better disk IO (and hence better performance) with
>>>> Rackspace's Cloud Servers.
>>>>
>>>> Full disclosure though, I work for Rackspace. :)
>>>>
>>>> --
>>>> Eric Evans
>>>> eev...@rackspace.com
>>>>
>>>>
>>>
>>
>>
>>
>> --
>> Mike Subelsky
>> oib.com // ignitebaltimore.com // subelsky.com
>> @subelsky // (410) 929-4022
>>
>



-- 
Mike Subelsky
oib.com // ignitebaltimore.com // subelsky.com
@subelsky
