I can think of at least two clusters running 32GB boxes with a single
Cassandra process on each.  (16GB seems to be more common.)  At 64GB I
would seriously consider multiple processes per machine.  You'd want
to configure a Snitch so that processes on the same machine are
considered to be in the same rack, since there is no separate closeness
level for "same machine".
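
To illustrate the "same machine means same rack" idea, here is a minimal
sketch in Python.  It is not Cassandra's Snitch API (a real Snitch is
written in Java); the addresses, machine names, and helper functions are
all hypothetical, and it assumes each process on a box is bound to its
own IP.

```python
# Hypothetical layout: two Cassandra processes per physical machine,
# each bound to its own IP. None of these values come from a real cluster.
PROCESS_TO_MACHINE = {
    "10.0.0.11": "machine-a",
    "10.0.0.12": "machine-a",
    "10.0.0.21": "machine-b",
    "10.0.0.22": "machine-b",
}


def rack_of(endpoint):
    """Report the physical machine as the rack for this process."""
    return PROCESS_TO_MACHINE[endpoint]


def same_rack(a, b):
    """True when both processes live on the same physical machine."""
    return rack_of(a) == rack_of(b)


def pick_replicas(endpoints, count):
    """Walk the endpoints in order, taking at most one per rack.

    Returns fewer than `count` endpoints if there are not enough racks.
    """
    chosen, seen_racks = [], set()
    for endpoint in endpoints:
        rack = rack_of(endpoint)
        if rack not in seen_racks:
            chosen.append(endpoint)
            seen_racks.add(rack)
        if len(chosen) == count:
            break
    return chosen
```

With this mapping, a rack-aware placement would not put two copies of
the same data on 10.0.0.11 and 10.0.0.12, since losing that one box
would lose both.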

At 32GB I think you're fine with one process.  Watch for latency spikes
and see how it goes.

I would run RAID 10 on the data disks if you can afford to give up the
space, otherwise RAID 0.  I don't know that anyone's tested RAID 5.
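
To put a number on "giving up the space", here is a rough sketch of the
usable capacity for the six data disks mentioned in the question.  The
1000 GB disk size is an assumption for illustration only (the thread
does not say how large the disks are), and the formulas are the standard
ones for each RAID level, ignoring filesystem and controller overhead.

```python
def usable_capacity(level, disks, disk_size_gb):
    """Usable capacity in GB for an array of identical disks."""
    if level == "raid0":
        # Pure striping: every disk holds data, no redundancy.
        return disks * disk_size_gb
    if level == "raid10":
        # Striped mirrors: half of the disks hold copies.
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return (disks // 2) * disk_size_gb
    if level == "raid5":
        # One disk's worth of space goes to parity.
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_size_gb
    raise ValueError("unsupported RAID level: " + level)


DATA_DISKS = 6        # from the question: 6 of the 8 drives hold data
DISK_SIZE_GB = 1000   # assumed size, not stated in the thread

for level in ("raid0", "raid10", "raid5"):
    print(level, usable_capacity(level, DATA_DISKS, DISK_SIZE_GB), "GB")
```

Under that assumption RAID 10 costs half of the raw space compared with
RAID 0, in exchange for surviving a single disk failure.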

On Sun, May 23, 2010 at 3:30 PM, Aaron McCurry <amccu...@gmail.com> wrote:
> I am planning on setting up a Cassandra cluster on a small 16 node cluster
> (possibly 32 way).  Each machine has 8 cores 32 Gig of ram and 8 hds.  My
> first thought is to setup one of those hds for the commit log, 6 for data
> and leave one for the OS.  However I do have a concern about best utilizing
> my memory, should I run a larger heap?  Should I run several cassandra
> processes on the same box?
> My concern about the larger heap is because GC's typically get slower.  And
> if I run several procs, does cassandra realize that it's the same box for
> replication purposes?
> I do have other hd conf options, hardware RAID 0,1,or 5.
> Just looking for some general configuration options as well as some real
> world successes with similarly sized hardware.  Thanks!
> Aaron



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
