If you're using RackUnawareStrategy (the default replication strategy)
then you can "bootstrap" manually fairly easily -- copy all the data
(not system) sstables from an overfull machine to a new machine,
assign the new one a token that gives it about half of the old node's
range, then start it with autobootstrap OFF.  Then run cleanup on both
new and old nodes to remove the part of the data that belongs to the
other.
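
To pick the new node's token, one option is to take the midpoint of the old node's range. Here is a minimal sketch, assuming RandomPartitioner (tokens are integers in a ring of size 2**127) and that a node owns the range from its predecessor's token (exclusive) up to its own token (inclusive). The function name and arguments are made up for illustration; they are not part of Cassandra.

```python
# Assumption: RandomPartitioner, whose tokens live on a ring of size 2**127.
RING_SIZE = 2 ** 127


def split_token(prev_token: int, old_token: int, ring_size: int = RING_SIZE) -> int:
    """Return a token roughly halfway through the range (prev_token, old_token].

    prev_token is the token of the node just before the overfull node on the
    ring; old_token is the overfull node's own token.
    """
    # Width of the old node's range, allowing for wrap-around past zero.
    width = (old_token - prev_token) % ring_size
    if width == 0:
        # A single-node ring owns the whole token space.
        width = ring_size
    return (prev_token + width // 2) % ring_size


if __name__ == "__main__":
    # Hypothetical tokens, for illustration only.
    print(split_token(0, 100))
```

The new node, given this token, takes over the lower half of the old node's range, and the old node keeps the upper half. The data itself is only approximately halved, since keys are not perfectly evenly spread across the range.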

The downside vs real bootstrap is you can't do this safely while
writes are coming in to the original node.  You can reduce your
read-only period by doing an initial scp, then doing a flush + rsync
when you're ready to take it read only.
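
The copy sequence could look roughly like the sketch below. It only builds the command lines and does not run anything. Assumptions, none of them from your setup: the data directory is /var/lib/cassandra/data with one subdirectory per keyspace, nodetool accepts "-h <host> flush <keyspace>", and rsync is used for the first pass as well as the second (in place of scp) so both passes are the same command. The function name and the host and keyspace names are hypothetical.

```python
def sync_commands(keyspaces, new_host, old_host="localhost",
                  data_dir="/var/lib/cassandra/data"):
    """Build (initial_copy, flush, final_copy) command lists.

    Assumed layout: data_dir/<keyspace>/ holds that keyspace's sstables.
    The system keyspace is skipped, since only data sstables are copied.
    """
    user_keyspaces = [ks for ks in keyspaces if ks != "system"]

    def rsync(ks):
        # Trailing slashes copy the directory contents, not the directory.
        return ["rsync", "-av", f"{data_dir}/{ks}/",
                f"{new_host}:{data_dir}/{ks}/"]

    # Pass 1: bulk copy while the old node is still taking writes.
    initial_copy = [rsync(ks) for ks in user_keyspaces]
    # Once writes are stopped: flush memtables to sstables on the old node.
    flush = [["nodetool", "-h", old_host, "flush", ks] for ks in user_keyspaces]
    # Pass 2: rsync again to pick up only the sstables written since pass 1.
    final_copy = [rsync(ks) for ks in user_keyspaces]
    return initial_copy, flush, final_copy


if __name__ == "__main__":
    # Hypothetical keyspace and host names.
    for phase in sync_commands(["Keyspace1", "system"], "newnode"):
        for cmd in phase:
            print(" ".join(cmd))
```

The second pass should be much shorter than the first, because it only moves what was flushed or compacted after the initial copy, and that is what keeps the read-only window small.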

(https://issues.apache.org/jira/browse/CASSANDRA-579 will make this
problem obsolete for 0.7 but that doesn't help you on 0.6, of course.)

On Fri, May 7, 2010 at 2:08 PM, David Koblas <kob...@extra.com> wrote:
> I've got two (out of five) nodes on my cassandra ring that somehow got too
> full (e.g. over 60% disk space utilization).  I've now gotten a few new
> machines added to the ring, but every time one of the overfull nodes attempts
> to stream its data it runs out of disk space...  I've tried half a dozen
> different bad ideas of how to get things moving along a bit smoother, but am
> at a total loss at this point.
>
> Are there any good tricks to get cassandra to not need 2x the disk space to
> stream out, or is something else potentially going on that's causing me
> problems?
>
> Thanks,
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
