Thanks for the answers. It went quite well. Note what Aaron writes about sstable names: I did the job before his mail arrived and got one name wrong :-) - that caused some trouble (a lot of missing file errors), and I think it was to blame for one counter CF being messed up. As it was not important, we didn't try from scratch again.

Vegard Berget
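A minimal sketch of the kind of rename involved, assuming the 1.1-era sstable naming pattern <ColumnFamily>-<version>-<generation>-<Component>.db (the path and offset below are examples only). It bumps every component file of a generation by the same amount, which is exactly the step where getting a single name wrong produces the missing-file errors described above:

    # Sketch: bump sstable generation numbers on copied files so they do not
    # collide with files already on the target node. Assumes 1.1-style names
    # like Users-hd-12-Data.db; run it on the copied files, not a live data dir.
    import os
    import re

    DATA_DIR = "/var/lib/cassandra/data/MyKeyspace"  # example path to the copied files
    OFFSET = 1000                                     # bump far enough to clear the later delta copy

    pattern = re.compile(r"^(?P<cf>.+)-(?P<version>[a-z]+)-(?P<gen>\d+)-(?P<component>.+)$")

    for name in sorted(os.listdir(DATA_DIR)):
        m = pattern.match(name)
        if not m:
            continue  # skip anything that is not an sstable component
        new_name = "%s-%s-%d-%s" % (m.group("cf"), m.group("version"),
                                    int(m.group("gen")) + OFFSET, m.group("component"))
        dst = os.path.join(DATA_DIR, new_name)
        if os.path.exists(dst):
            raise RuntimeError("refusing to overwrite " + new_name)
        # Data.db, Index.db, Filter.db, Statistics.db, ... of one generation all
        # get the same new number, so a generation never ends up half renamed.
        os.rename(os.path.join(DATA_DIR, name), dst)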
aaron morton <aa...@thelastpickle.com>:
>Sounds about right, I've done similar things before.
>
>Some notes…
>
>* I would make sure repair has completed on the source cluster before making changes. I just like to know data is distributed. I would also do it once all the moves are done.
>
>* Rather than flush, take a snapshot and copy from that. Then you will have a stable set of files and it's easier to go back and see what you copied. (Snapshot does a flush.)
>
>* Take a second snapshot after you stop writing to the original cluster and work out the delta between them. New files in the second snapshot are the ones to copy.
>
>>> Both nodes are 1.1.6, but it might be that we upgrade the target to 1.1.7, as I can't see that this will cause any problems?
>I would always do one thing at a time. Upgrade before or after the move, not in the middle of it.
>
>>> 1) It's the same number of nodes on both clusters, but do the tokens need to be the same as well? (Wouldn't a repair correct that later?)
>I *think* you are moving from nodes in one cluster to nodes in a different cluster (i.e. not adding a "data centre" to an existing cluster). In which case it does not matter too much, but I would keep them the same.
>
>>> 2) Could data files have any name? Could we, to avoid a filename crash, just substitute the numbers with for example XXX in the data files?
>The names have to match the expected patterns.
>
>It may be easier to rename the files in your first copy, not the second delta copy. Bump the file numbers enough that all the files in the delta copy do not need to be renamed.
>
>>> 3) Is this really a sane way to do things?
>If you are moving data from one set of nodes in a Cassandra cluster to another set of nodes in another cluster this is reasonable. You could add the new nodes as a new DC and do the whole thing without downtime, but you mentioned that was not possible.
>
>It looks like you are going to have some downtime, or can accept some downtime, so here's a tweak. You should be able to get the delta copy part done pretty quickly. If that's the case you can:
>
>1) do the main copy
>2) stop the old system
>3) do the delta copy
>4) start the new system
>
>That way you will not have stale reads in the new system.
>
>Hope that helps.
>
>-----------------
>Aaron Morton
>Freelance Cassandra Developer
>New Zealand
>
>@aaronmorton
>http://www.thelastpickle.com
>
>On 20/12/2012, at 5:08 PM, B. Todd Burruss <bto...@gmail.com> wrote:
>
>> To get it "correct", meaning consistent, it seems you will need to do a repair no matter what, since the source cluster is taking writes during this time and writing to the commit log. So to avoid filename issues just do the first copy and then repair. I am not sure if they can have any filename.
>>
>> To the question about whether the tokens must be the same, the answer is they can't be (http://www.datastax.com/docs/datastax_enterprise2.0/multi_dc_install). I believe that as long as your replication factor is > 1, using repair would fix most any token assignment.
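On the token question, a minimal sketch of the usual arithmetic for evenly spaced initial_token values on a RandomPartitioner ring (the node count is an example); since both clusters have the same number of nodes, the simplest option is to reuse the source cluster's tokens as Aaron suggests:

    # Sketch: evenly spaced initial_token values for a RandomPartitioner ring
    # (token range 0 .. 2**127). NODE_COUNT is an example value.
    NODE_COUNT = 4

    tokens = [i * (2 ** 127) // NODE_COUNT for i in range(NODE_COUNT)]
    for i, token in enumerate(tokens):
        print("node %d: initial_token = %d" % (i, token))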
>> On Wed, Dec 19, 2012 at 4:27 AM, Vegard Berget <p...@fantasista.no> wrote:
>>> Hi,
>>>
>>> I know this has been a topic here before, but I need some input on how to move data from one datacenter to another (and Google just gives me some old mails) - and at the same time moving "production" writing the same way.
>>> To add the target cluster into the source cluster and just replicate data before moving source nodes is not an option, but my plan is as follows:
>>> 1) Flush data on the source cluster and move all the data/ files to the destination cluster. While this is going on, we are still writing to the source cluster.
>>> 2) When the data is copied, start Cassandra on the new cluster - and then move writing/reading to the new cluster.
>>> 3) Now, do a new flush on the source cluster. As I understand it, the sstable files are immutable, so the _newly added_ data/ files could be moved to the target cluster.
>>> 4) After the new data is also copied into the target data/, do a nodetool refresh to load the new sstables into the system (I know we need to take care of filenames).
>>>
>>> It's worth noting that none of the data is critical, but it would be nice to get it correct. I know that there will be a short period between 2 and 4 where reads could potentially return old data (written while copying, read after we have moved read/write). This is OK in this case. Our second alternative is to:
>>>
>>> 1) Drain the old cluster
>>> 2) Copy to the new cluster
>>> 3) Start the new cluster
>>>
>>> This will cause the cluster to be unavailable for writes during the copy period, and I wish to avoid that (even if that, too, is survivable).
>>>
>>> Both nodes are 1.1.6, but it might be that we upgrade the target to 1.1.7, as I can't see that this will cause any problems?
>>>
>>> Questions:
>>>
>>> 1) It's the same number of nodes on both clusters, but do the tokens need to be the same as well? (Wouldn't a repair correct that later?)
>>>
>>> 2) Could data files have any name? Could we, to avoid a filename crash, just substitute the numbers with for example XXX in the data files?
>>>
>>> 3) Is this really a sane way to do things?
>>>
>>> Suggestions are most welcome!
>>>
>>> Regards
>>> Vegard Berget
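A minimal sketch of the delta step (points 3 and 4 above, or Aaron's second snapshot): record the file names shipped in the first copy, then after the final flush list only the sstable files that are new, ship those, and run nodetool refresh on the target. The paths and manifest name are assumptions:

    # Sketch: work out which sstable files appeared since the first copy, so only
    # the delta needs shipping before running "nodetool refresh <keyspace> <cf>"
    # on the target node.
    import os

    SOURCE_DATA_DIR = "/var/lib/cassandra/data/MyKeyspace"  # example path
    FIRST_COPY_MANIFEST = "first_copy_files.txt"            # names recorded during the first copy

    with open(FIRST_COPY_MANIFEST) as f:
        already_copied = set(line.strip() for line in f)

    delta = [name for name in sorted(os.listdir(SOURCE_DATA_DIR))
             if name not in already_copied]

    for name in delta:
        print(name)  # feed this list to scp/rsync, then refresh on the target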