Storage size is not a problem; you can always add more nodes. That said, it
is not recommended to run nodes with more than 500 GB each (compaction and
repair take forever). An EC2 m1.large has roughly 800 GB of ephemeral
storage, an m1.xlarge 1.6 TB. I'd recommend the xlarge: it has 4 CPUs, so
maintenance procedures don't affect performance much.
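
If you go with ephemeral storage, the usual approach is to RAID0 the
instance-store volumes into a single mount and point Cassandra at it. A
minimal cassandra.yaml sketch, assuming the array is mounted at
/mnt/ephemeral (the paths are just placeholders, adjust to your setup):

    # directories on the RAID0 array built from the instance-store disks
    data_file_directories:
        - /mnt/ephemeral/cassandra/data
    commitlog_directory: /mnt/ephemeral/cassandra/commitlog
    saved_caches_directory: /mnt/ephemeral/cassandra/saved_caches

Keep in mind that instance-store data is gone if the instance is stopped or
fails, so you're relying on Cassandra's replication to keep the data safe.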

Andrey


On Wed, Jan 16, 2013 at 12:42 PM, Marcelo Elias Del Valle <
mvall...@gmail.com> wrote:

> Hello,
>
>    I am currently using Hadoop + Cassandra on Amazon AWS. Cassandra runs
> on EC2 and my Hadoop jobs run on EMR. For Cassandra storage, I am using
> EBS volumes attached to the EC2 instances.
>    My system runs fine for my tests, but to me it's not a good setup for
> production. I need the system to perform well, especially for writes to
> Cassandra, and the amount of data could grow really large, taking several
> TB of total storage.
>     My first thought was to use S3 for storage, and I saw this can be done
> with the Cloudian package, but I don't want to become dependent on a
> pre-packaged solution, and I found it rather expensive above 100 TB:
> http://www.cloudian.com/pricing.html
>     I also saw some discussion on the internet about using EBS or
> ephemeral disks for storage at Amazon.
>
>     My question is: has anyone on this list faced the same problem? What
> are you using for Cassandra storage when running it on Amazon AWS?
>
>     Any thoughts would be highly appreciated.
>
> Best regards,
> --
> Marcelo Elias Del Valle
> http://mvalle.com - @mvallebr
>
