Thanks for the answers. It went quite well. Note what Aaron writes about sstable names: I did the job before his mail arrived and got one name wrong :-) - that caused some trouble (a lot of missing file errors), and I think it was to blame for one counter CF being messed up. As it was not important, we didn't try from scratch again.

Vegard Berget
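A minimal sketch of the kind of rename involved, assuming the 1.1-era sstable naming pattern <ColumnFamily>-<version>-<generation>-<Component>.db (the path and offset below are examples only). It bumps every component file of a generation by the same amount, which is exactly the step where getting a single name wrong produces the missing-file errors described above:

    # Sketch: bump sstable generation numbers on copied files so they do not
    # collide with files already on the target node. Assumes 1.1-style names
    # like Users-hd-12-Data.db; run it on the copied files, not a live data dir.
    import os
    import re

    DATA_DIR = "/var/lib/cassandra/data/MyKeyspace"  # example path to the copied files
    OFFSET = 1000                                     # bump far enough to clear the later delta copy

    pattern = re.compile(r"^(?P<cf>.+)-(?P<version>[a-z]+)-(?P<gen>\d+)-(?P<component>.+)$")

    for name in sorted(os.listdir(DATA_DIR)):
        m = pattern.match(name)
        if not m:
            continue  # skip anything that is not an sstable component
        new_name = "%s-%s-%d-%s" % (m.group("cf"), m.group("version"),
                                    int(m.group("gen")) + OFFSET, m.group("component"))
        dst = os.path.join(DATA_DIR, new_name)
        if os.path.exists(dst):
            raise RuntimeError("refusing to overwrite " + new_name)
        # Data.db, Index.db, Filter.db, Statistics.db, ... of one generation all
        # get the same new number, so a generation never ends up half renamed.
        os.rename(os.path.join(DATA_DIR, name), dst)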
aaron morton <aa...@thelastpickle.com>:
>Sounds about right, I've done similar things before.
>
>Some notes…
>
>* I would make sure repair has completed on the source cluster before making changes. I just like to know data is distributed. I would also do it once all the moves are done.
>
>* Rather than flush, take a snapshot and copy from that. Then you will have a stable set of files and it's easier to go back and see what you copied. (Snapshot does a flush.)
>
>* Take a second snapshot after you stop writing to the original cluster and work out the delta between them. New files in the second snapshot are the ones to copy.
>
>>> Both nodes are 1.1.6, but it might be that we upgrade the target to 1.1.7, as I can't see that this will cause any problems?
>I would always do one thing at a time. Upgrade before or after the move, not in the middle of it.
>
>>> 1) It's the same number of nodes on both clusters, but do the tokens need to be the same as well? (Wouldn't a repair correct that later?)
>I *think* you are moving from nodes in one cluster to nodes in a different cluster (i.e. not adding a "data centre" to an existing cluster). In which case it does not matter too much, but I would keep them the same.
>
>>> 2) Could data files have any name? Could we, to avoid a filename crash, just substitute the numbers with for example XXX in the data files?
>The names have to match the expected patterns.
>
>It may be easier to rename the files in your first copy, not the second delta copy. Bump the file numbers enough that all the files in the delta copy do not need to be renamed.
>
>>> 3) Is this really a sane way to do things?
>If you are moving data from one set of nodes in a Cassandra cluster to another set of nodes in another cluster this is reasonable. You could add the new nodes as a new DC and do the whole thing without downtime, but you mentioned that was not possible.
>
>It looks like you are going to have some downtime, or can accept some downtime, so here's a tweak. You should be able to get the delta copy part done pretty quickly. If that's the case you can:
>
>1) do the main copy
>2) stop the old system
>3) do the delta copy
>4) start the new system
>
>That way you will not have stale reads in the new system.
>
>Hope that helps.
>
>-----------------
>Aaron Morton
>Freelance Cassandra Developer
>New Zealand
>
>@aaronmorton
>http://www.thelastpickle.com
>
>On 20/12/2012, at 5:08 PM, B. Todd Burruss <bto...@gmail.com> wrote:
>
>> To get it "correct", meaning consistent, it seems you will need to do a repair no matter what, since the source cluster is taking writes during this time and writing to the commit log. So to avoid filename issues just do the first copy and then repair. I am not sure if they can have any filename.
>>
>> To the question about whether the tokens must be the same, the answer is they can't be (http://www.datastax.com/docs/datastax_enterprise2.0/multi_dc_install). I believe that as long as your replication factor is > 1, using repair would fix most any token assignment.
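On the token question, a minimal sketch of the usual arithmetic for evenly spaced initial_token values on a RandomPartitioner ring (the node count is an example); since both clusters have the same number of nodes, the simplest option is to reuse the source cluster's tokens as Aaron suggests:

    # Sketch: evenly spaced initial_token values for a RandomPartitioner ring
    # (token range 0 .. 2**127). NODE_COUNT is an example value.
    NODE_COUNT = 4

    tokens = [i * (2 ** 127) // NODE_COUNT for i in range(NODE_COUNT)]
    for i, token in enumerate(tokens):
        print("node %d: initial_token = %d" % (i, token))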
>> On Wed, Dec 19, 2012 at 4:27 AM, Vegard Berget <p...@fantasista.no> wrote:
>>> Hi,
>>>
>>> I know this has been a topic here before, but I need some input on how to move data from one datacenter to another (and Google just gives me some old mails) - and at the same time moving "production" writing the same way.
>>> To add the target cluster into the source cluster and just replicate data before moving source nodes is not an option, but my plan is as follows:
>>> 1) Flush data on the source cluster and move all the data/ files to the destination cluster. While this is going on, we are still writing to the source cluster.
>>> 2) When the data is copied, start Cassandra on the new cluster - and then move writing/reading to the new cluster.
>>> 3) Now, do a new flush on the source cluster. As I understand it, the sstable files are immutable, so the _newly added_ data/ files could be moved to the target cluster.
>>> 4) After the new data is also copied into the target data/, do a nodetool refresh to load the new sstables into the system (I know we need to take care of filenames).
>>>
>>> It's worth noting that none of the data is critical, but it would be nice to get it correct. I know that there will be a short period between 2 and 4 where reads could potentially return old data (written while copying, read after we have moved read/write). This is OK in this case. Our second alternative is to:
>>>
>>> 1) Drain the old cluster
>>> 2) Copy to the new cluster
>>> 3) Start the new cluster
>>>
>>> This will cause the cluster to be unavailable for writes during the copy period, and I wish to avoid that (even if that, too, is survivable).
>>>
>>> Both nodes are 1.1.6, but it might be that we upgrade the target to 1.1.7, as I can't see that this will cause any problems?
>>>
>>> Questions:
>>>
>>> 1) It's the same number of nodes on both clusters, but do the tokens need to be the same as well? (Wouldn't a repair correct that later?)
>>>
>>> 2) Could data files have any name? Could we, to avoid a filename crash, just substitute the numbers with for example XXX in the data files?
>>>
>>> 3) Is this really a sane way to do things?
>>>
>>> Suggestions are most welcome!
>>>
>>> Regards
>>> Vegard Berget
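A minimal sketch of the delta step (points 3 and 4 above, or Aaron's second snapshot): record the file names shipped in the first copy, then after the final flush list only the sstable files that are new, ship those, and run nodetool refresh on the target. The paths and manifest name are assumptions:

    # Sketch: work out which sstable files appeared since the first copy, so only
    # the delta needs shipping before running "nodetool refresh <keyspace> <cf>"
    # on the target node.
    import os

    SOURCE_DATA_DIR = "/var/lib/cassandra/data/MyKeyspace"  # example path
    FIRST_COPY_MANIFEST = "first_copy_files.txt"            # names recorded during the first copy

    with open(FIRST_COPY_MANIFEST) as f:
        already_copied = set(line.strip() for line in f)

    delta = [name for name in sorted(os.listdir(SOURCE_DATA_DIR))
             if name not in already_copied]

    for name in delta:
        print(name)  # feed this list to scp/rsync, then refresh on the target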