Sounds great, I'll give it a go. First, though, I want to make sure I'm getting the keyspace assignment right.

Let's say I've got:
    A -- Node before overfull node in keyspace order
    O -- Overfull node
    B -- Node after O in keyspace order
    N -- New empty node

I'm going to assume that I should make the following assignment:
    keyspace(N) = keyspace(A) + ( keyspace(O) - keyspace(A) ) / 2
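
For what it's worth, that midpoint assignment can be sketched in Python. This is only an illustration under assumptions not stated in the thread: it treats each "keyspace" value as the node's integer token, assumes a RandomPartitioner-style ring of size 2**127, and uses a hypothetical helper name, midpoint_token. The modulo handles the case where O's range wraps past the end of the ring, which the plain formula does not.

```python
# Assumption: tokens are integers on a ring of this size (RandomPartitioner-style).
RING_SIZE = 2 ** 127


def midpoint_token(prev_token: int, overfull_token: int,
                   ring_size: int = RING_SIZE) -> int:
    """Return a token halfway between A (prev_token) and O (overfull_token).

    Hypothetical helper, not part of Cassandra. The distance is taken
    modulo ring_size so the result is still correct when the range from
    A to O wraps around the end of the ring.
    """
    span = (overfull_token - prev_token) % ring_size
    return (prev_token + span // 2) % ring_size


if __name__ == "__main__":
    # Plain case: same result as keyspace(A) + (keyspace(O) - keyspace(A)) / 2
    print(midpoint_token(100, 200))
    # Wrapping case on a tiny ring of size 100: A = 90, O = 10
    print(midpoint_token(90, 10, ring_size=100))
```

Assuming the usual convention that a node owns the range ending at its own token, N would take the lower half of O's old range and O would keep the upper half.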

Or did I miss something else about keyspace ranges?
Thanks


On 5/7/10 1:25 PM, Jonathan Ellis wrote:
If you're using RackUnawareStrategy (the default replication strategy)
then you can "bootstrap" manually fairly easily -- copy all the data
(not system) sstables from an overfull machine to a new machine,
assign the new one a token that gives it about half of the old node's
range, then start it with autobootstrap OFF.  Then run cleanup on both
new and old nodes to remove the part of the data that belongs to the
other.

The downside vs real bootstrap is you can't do this safely while
writes are coming in to the original node.  You can reduce your
read-only period by doing an initial scp, then doing a flush + rsync
when you're ready to take it read only.

(https://issues.apache.org/jira/browse/CASSANDRA-579 will make this
problem obsolete for 0.7 but that doesn't help you on 0.6, of course.)

On Fri, May 7, 2010 at 2:08 PM, David Koblas<kob...@extra.com>  wrote:
I've got two (out of five) nodes on my cassandra ring that somehow got too
full (i.e. over 60% disk space utilization).  I've now gotten a few new
machines added to the ring, but every time one of the overfull nodes attempts
to stream its data it runs out of diskspace...  I've tried half a dozen
different bad ideas of how to get things moving along a bit smoother, but am
at a total loss at this point.

Are there any good tricks to get cassandra to not need 2x the disk space to
stream out, or is something else potentially going on that's causing me
problems?

Thanks,


