What do you mean by "running live"? I am also planning to run Cassandra on EC2 using Small instances. A Small instance has 1/4 of the CPU of a Large one at 1/4 of the cost, but its I/O is more than 1/4 (Amazon does not publish explicit I/O numbers), so I would expect 4 Small instances to perform better than 1 Large one for the same cost. Am I wrong?

On September 27, 2010 at 18:09:14 UTC+2, Jonathan Ellis <jbel...@gmail.com> wrote:
> I strongly recommend not running live on Small nodes. So in your case
> I would recommend starting up Large instances with raid0'd disks, shut
> down cassandra on the Small ones, rsync to the Large, and start up on
> Large.
>
> On Mon, Sep 27, 2010 at 6:46 AM, Utku Can Topçu <u...@topcu.gen.tr> wrote:
> > Hi All,
> >
> > We're currently running a cassandra cluster with Replication Factor 3,
> > consisting of 4 nodes.
> >
> > The current situation is:
> >
> > - The nodes are all identical (AWS small instances)
> > - Data directory is in the partition (/mnt) which has 150G capacity and each
> > node has around 90 GB load, so 60 G free space per node is left.
> >
> > So adding a new node to the cluster will seem to cause problems for us. I
> > think the node which will stream the data to the new bootstrapping node,
> > will not have enough disk space for anticompacting its data.
> >
> > What should be the best practice for such scenarios?
> >
> > Regards,
> >
> > Utku
> >
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
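
For the disk-space concern in the quoted message, here is a rough back-of-the-envelope check, not a statement of how Cassandra actually sizes anticompaction. It assumes the streaming node needs temporary space roughly equal to the share of its data being handed to the new node; `streamed_fraction` and both function names are hypothetical, and the real overhead may differ.

```python
# Rough headroom check for the numbers in the quoted message.
# ASSUMPTION: anticompaction needs about as much temporary space as the
# data being split off for the new node; this is a simplification.

def free_space_gb(capacity_gb, load_gb):
    """Free space left on the data partition."""
    return capacity_gb - load_gb


def fits_for_anticompaction(capacity_gb, load_gb, streamed_fraction):
    """True if the (assumed) temporary copy fits in the free space.

    streamed_fraction is a hypothetical parameter: the share of this
    node's data that would be rewritten for the bootstrapping node.
    """
    needed_gb = load_gb * streamed_fraction
    return needed_gb <= free_space_gb(capacity_gb, load_gb)


if __name__ == "__main__":
    # 150 GB partition, about 90 GB load per node (from the thread).
    print(free_space_gb(150, 90))
    print(fits_for_anticompaction(150, 90, 0.5))
    print(fits_for_anticompaction(150, 90, 1.0))
```

Under this assumption, 60 GB of free space covers handing off half of a 90 GB node (45 GB) but not a full copy, so whether the bootstrap is safe depends on how much data actually has to be rewritten.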