Hi, I am looking for an efficient way to migrate a portion of the data in one Cassandra cluster to another, separate Cassandra cluster. What I need to solve is the typical live-migration problem that appears in any "DB sharding" setup, where you need to transfer "ownership" of certain rows from DB1 to DB2...but in a way that clients see no (or almost no) disruption when you actually do the cutover to DB2 for those writes.
I mean doing something typical like:

    loop (until almost no rows have been modified):
        rows = SELECT * FROM T WHERE "criteria matches" (i.e., shard_id = 1) AND updated_at > last_time
        last_time = now
        insert(rows) elsewhere
    end
    "lock" modifications to the original DB
    do one last SELECT to pick up the last few modified rows
    cut over the ownership (change and make sure the clients know that the new home for that data is the other "DB")
    unlock modifications

So, anyway, I thought I'd be able to apply the same principle by passing a timestamp of sorts to the get_slices call, so I could restrict the results to only those columns whose timestamps are newer than the one passed. Looking at the Thrift interface, though, I see there is no timestamp parameter at all...which makes me wonder how people are doing this, and whether there are any well-known practices for it.

Setting up a full new replicating DC within the same cluster doesn't work for me, as there are clear cases where you want completely separate Cassandra rings.

Cheers,
Josep M.
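
P.S. In case it makes the idea clearer, here is roughly the shape of the loop I'm describing, sketched in Python with the DataStax CQL driver rather than raw Thrift. The keyspace, table, and column names (t, shard_id, row_id, payload, updated_at) are just placeholders, it assumes an explicit updated_at column rather than the internal column timestamps, and block_writes() / point_clients_at_new_ring() / unblock_writes() are stand-ins for whatever application-level mechanism actually does the locking and the cutover:

    from datetime import datetime, timezone
    from cassandra.cluster import Cluster

    def block_writes(shard_id):
        """Placeholder: however the application pauses writes for this shard."""

    def point_clients_at_new_ring(shard_id):
        """Placeholder: tell clients the shard's new home is the other ring."""

    def unblock_writes(shard_id):
        """Placeholder: resume writes once the cutover is done."""

    def migrate_shard(shard_id, quiet_threshold=10):
        """Copy one shard's rows to the new ring, iterating until the
        number of freshly modified rows per pass is small, then cut over."""
        src = Cluster(['old-ring-host']).connect('my_keyspace')
        dst = Cluster(['new-ring-host']).connect('my_keyspace')

        # updated_at is a regular column here, so the source query needs
        # ALLOW FILTERING (or an index / clustering order on updated_at);
        # this only shows the shape of the loop, not a tuned query.
        select = src.prepare(
            "SELECT * FROM t WHERE shard_id = ? AND updated_at > ? ALLOW FILTERING")
        insert = dst.prepare(
            "INSERT INTO t (shard_id, row_id, payload, updated_at) VALUES (?, ?, ?, ?)")

        last_time = datetime(1970, 1, 1, tzinfo=timezone.utc)
        while True:
            cutoff = datetime.now(timezone.utc)
            rows = list(src.execute(select, (shard_id, last_time)))
            for r in rows:
                dst.execute(insert, (r.shard_id, r.row_id, r.payload, r.updated_at))
            last_time = cutoff
            if len(rows) <= quiet_threshold:
                break  # almost nothing changed since the last pass

        block_writes(shard_id)          # application-level "lock"
        for r in src.execute(select, (shard_id, last_time)):
            dst.execute(insert, (r.shard_id, r.row_id, r.payload, r.updated_at))
        point_clients_at_new_ring(shard_id)   # the actual cutover
        unblock_writes(shard_id)

My question is essentially about the second-to-last step: how to do that "rows modified since last_time" read efficiently against the old ring.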