I am also interested in seeing how Cassandra performs on various virtual
platforms.


On Wed, Jul 7, 2010 at 2:15 PM, Andrew Rollins <and...@localytics.com> wrote:

> On Wed, Jul 7, 2010 at 2:27 AM, Michael Dürgner <m...@duergner.de> wrote:
>
>> Have you done some testing with small nodes already? From what we saw when
>> trying to run IO-bound services on small instances, their IO performance is
>> really bad compared to other instance types, as you can read in several
>> blogs.
>>
>> It would be interesting to hear whether a Cassandra cluster can handle that.
>>
>
> I have, actually.
>
> I tested on 10 small nodes on Amazon EC2, each with 1 EBS disk. I've been
> avoiding large nodes for now since they are 4x the cost of a small, so 10
> small nodes cost the same as 2.5 large nodes. We figured it's better to
> slice things into more nodes, since with only 2 or 3 nodes, large chunks of
> data would need to be moved if a node failed.
>
> Under pure write loads with a fairly default config and 3x replication, we
> achieved 1,000 writes per second and probably could have pushed it a little
> bit more (perhaps to 2k per second). Write speed barely slowed even as we
> pushed past 50 million keys. Keys were 255 bytes with a single column
> containing 768 bytes.
>
> Things got much worse when we introduced reads, however. We did a 50/50
> read/write split. IO went up, and nodes failed a couple of hours into the
> test with out-of-memory errors. My theory is that the reads caused much more
> IO, which caused writes to get backed up in memory.
>
> I've had success in the past with RAID striping on EBS volumes. I was able
> to get a nearly 4x improvement on a small instance with MySQL, so my next
> step would be to try RAID with Cassandra.
>
> Also, another theory is that CommitLogSync in batch mode might allow me to
> effectively rate limit writing so that I don't overflow memory.
>
> Thoughts?
>
> - Andrew
>
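
For anyone wanting to sanity-check the numbers quoted above, here is a rough
back-of-envelope sketch in Python. It only uses the figures from Andrew's
message (255-byte keys, 768-byte column values, 50 million keys, 3x
replication, 10 small nodes, large = 4x the cost of small). The function names
are made up for illustration, and the result is an assumption-laden lower
bound: it assumes data is spread evenly across nodes and ignores column names,
timestamps, indexes, and compaction overhead.

```python
# Back-of-envelope sizing using only the figures quoted in the thread.
# All names here are hypothetical helpers, not part of any Cassandra API.

KEY_BYTES = 255            # key size quoted in the thread
VALUE_BYTES = 768          # single column value size quoted in the thread
KEYS = 50_000_000          # "past 50 million keys"
REPLICATION = 3            # "3x replication"
SMALL_NODES = 10           # "10 small nodes"
LARGE_COST_IN_SMALLS = 4   # "large nodes ... are 4x the cost of a small"


def raw_bytes_per_row():
    """Key plus value payload only; real on-disk rows are larger."""
    return KEY_BYTES + VALUE_BYTES


def cluster_payload_bytes():
    """Payload stored across the whole cluster, counting every replica."""
    return raw_bytes_per_row() * KEYS * REPLICATION


def per_node_payload_bytes():
    """Payload per node, assuming a perfectly even distribution."""
    return cluster_payload_bytes() / SMALL_NODES


def large_equivalent(small_nodes):
    """How many large nodes the same budget would buy."""
    return small_nodes / LARGE_COST_IN_SMALLS


if __name__ == "__main__":
    gib = 2 ** 30
    print("bytes per row:", raw_bytes_per_row())
    print("cluster payload (GiB):", round(cluster_payload_bytes() / gib, 1))
    print("per-node payload (GiB):", round(per_node_payload_bytes() / gib, 1))
    print("large-node equivalent:", large_equivalent(SMALL_NODES))
```

Under those assumptions each small node holds roughly 14 GiB of raw payload,
which is far more than a small instance has in RAM, so a read-heavy test would
be expected to hit the EBS disk hard.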



-- 
-Richard L. Burton III
