We can't do a straight upgrade from 0.6.13 to 0.7.5 because we have rows
stored that have unicode keys, and Cassandra 0.7.5 thinks those rows in the
sstables are corrupt, and it seems impossible to clean it up without losing
data.

However, we can still read all rows perfectly via thrift so we are now
looking at building a simple tool that will copy all rows from our 0.6.13
cluster to a parallel 0.7.5 cluster. Our question is now how to do that and
ensure that we actually get all rows migrated? It's a pretty small cluster,
3 machines, a single keyspace, a single column family, ~2 million rows, a few
GB of data, and a replication factor of 3.

So what's the best way? Call get_range_slices and move through the entire
token space? We also have all row keys in a secondary system, would it be
better to use that and make calls to multiget or multiget_slice instead?
Are we correct in assuming that if we use the consistency level ALL we'll get
all rows?
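
To make the range-scan option concrete, here is a rough Python sketch of the
paging loop, combined with a cross-check against the key list from the
secondary system. It is not tied to any particular client library:
fetch_page and missing_keys are hypothetical names, and fetch_page is assumed
to wrap a get_range_slices call that returns up to `count` rows in the
cluster's own order, starting at (and including) the given start key.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Set, Tuple

Row = Tuple[str, Dict]  # (row key, columns)


def iter_all_rows(fetch_page: Callable[[str, int], List[Row]],
                  page_size: int = 100) -> Iterator[Row]:
    """Yield every row by paging through a range scan.

    fetch_page(start_key, count) is a hypothetical wrapper, assumed to
    return up to `count` rows beginning with `start_key` (inclusive), or
    from the start of the range when start_key is "".
    """
    if page_size < 2:
        # With a page size of 1 each page would only repeat the last row.
        raise ValueError("page_size must be at least 2")
    start_key = ""
    while True:
        rows = fetch_page(start_key, page_size)
        # Later pages start with the row that ended the previous page,
        # so drop it if it shows up again.
        if start_key and rows and rows[0][0] == start_key:
            new_rows = rows[1:]
        else:
            new_rows = rows
        for row in new_rows:
            yield row
        if len(rows) < page_size:
            break  # a short page means the scan reached the end
        start_key = rows[-1][0]


def missing_keys(known_keys: Iterable[str], rows: Iterable[Row]) -> Set[str]:
    """Return keys from the secondary system that the scan never returned."""
    seen = {key for key, _ in rows}
    return set(known_keys) - seen
```

Any keys reported by missing_keys could then be fetched individually (for
example with multiget_slice) and copied in a second pass, so the two
approaches complement each other rather than compete.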


/Henrik Schröder
