Observation on shuffling vs adding/removing nodes

Andrew Bialecki Sat, 23 Mar 2013 13:41:49 -0700

Just curious if anyone has any thoughts on something we've observed in a
small test cluster.


We had around 100 GB of data on a 3 node cluster (RF=2) and wanted to start
using vnodes. We upgraded the cluster to 1.2.2 and then followed the
instructions for using vnodes. We initially tried to run a shuffle, however
it seemed to be going really slowly (very little progress by watching
"cassandra-shuffle ls | wc -l" after 5-6 hours and no errors in logs), so
we cancelled it and instead added 3 nodes to the cluster, waited for them
to bootstrap, and then decommissioned the first 3 nodes. Total process took
about 3 hours. My assumption is that the final result is the same in terms
of data distributed somewhat randomly across nodes now (assuming no bias in
the token ranges selected when bootstrapping a node).

If that assumption is correct, the observation would be, if possible,
adding nodes and then removing nodes appears to be a faster way to shuffle
data for small clusters. Obviously not always possible, but I thought I'd
just throw this out there in case anyone runs into a similar situation.
This cluster is unsurprisingly on EC2 instances, which made provisioning
and shutting down nodes extremely easy.

Cheers,
Andrew

Observation on shuffling vs adding/removing nodes

Reply via email to