Thank you for the detailed reply, Rob! I have replied to your comments in-line below.
On Nov 14, 2013, at 1:15 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Thu, Nov 14, 2013 at 12:37 PM, David Laube <d...@stormpath.com> wrote:
>> It is almost as if the data only exists on some of the nodes, or perhaps
>> the token ranges are dramatically different -- again, we are using vnodes
>> so I am not exactly sure how this plays into the equation.
>
> The token ranges are dramatically different, due to vnode random token
> selection from not setting initial_token, and setting num_tokens.
>
> You can verify this by listing the tokens per physical node in nodetool
> gossipinfo or (iirc) nodetool status.
>
>> 5. Copy 1 of the 5 snapshot archives from cluster-A to each of the five
>> nodes in the new cluster-B ring.
>
> I don't understand this at all. Do you mean that you are using one source
> node's data to load each of the target nodes? Or are you just saying
> there's a 1:1 relationship between source snapshots and target nodes to
> load into? Unless you have RF=N, using one source for 5 target nodes
> won't work.

We have configured RF=3 for the keyspace in question. Also, from a client
perspective, we read with CL=1 and write with CL=QUORUM. Since we have 5
nodes total in cluster-A, we snapshot keyspace_name on each of the five
nodes, which results in a snapshot directory on each node that we archive
and ship off to S3. We then take the snapshot archive generated FROM
cluster-A_node1 and copy/extract/restore TO cluster-B_node1, then we take
the snapshot archive FROM cluster-A_node2 and copy/extract/restore TO
cluster-B_node2, and so on.

> To do what I think you're attempting to do, you have basically two
> options.
>
> 1) don't use vnodes and do a 1:1 copy of snapshots
> 2) use vnodes and
>    a) get a list of tokens per node from the source cluster
>    b) put a comma-delimited list of these in initial_token in
>       cassandra.yaml on target nodes
>    c) probably have to unset num_tokens (this part is unclear to me, you
>       will have to test)
>    d) set auto_bootstrap:false in cassandra.yaml
>    e) start target nodes; they will come up, without bootstrapping,
>       owning the same ranges as the source cluster
>    f) load schema / copy data into datadir (being careful of
>       https://issues.apache.org/jira/browse/CASSANDRA-6245)
>    g) restart node or use nodetool refresh (I'd probably restart the
>       node to avoid the bulk rename that refresh does) to pick up
>       sstables
>    h) remove auto_bootstrap:false from cassandra.yaml
>
> I *believe* this *should* work, but have never tried it as I do not
> currently run with vnodes. It should work because it basically makes
> implicit vnode tokens explicit in the conf file. If it *does* work, I'd
> greatly appreciate you sharing details of your experience with the list.

I'll start by parsing out the token ranges that our vnode config ends up
assigning in cluster-A, and doing some creative config work on the target
cluster-B we are trying to restore to, as you have suggested. Depending on
what additional comments/recommendations you or other members of the list
may have (if any) based on the clarification I've made above, I will
absolutely report back my findings here. I've sketched the end-to-end
procedure I'm planning to test at the bottom of this message.

> General reference on tasks of this nature (does not consider vnodes, but
> treat vnodes as "just a lot of physical nodes" and it is mostly
> relevant):
> http://www.palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra
>
> =Rob
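For the archives, here is the procedure I'm planning to test, written out
as rough sketches. None of this is validated yet, and the paths, addresses,
bucket names and snapshot tags below are placeholders from our environment,
not anything authoritative.

First, the cluster-A side we already do today: snapshot the keyspace on
every node, archive the snapshot directories, and ship the archives to S3.
Roughly this, on each of the five source nodes (the pre_migration tag,
data directory path, and bucket name are made up):

    # Flush memtables and take a named snapshot of the keyspace.
    nodetool snapshot -t pre_migration keyspace_name

    # The snapshot lands under each table's data directory, e.g.
    #   /var/lib/cassandra/data/keyspace_name/<table>/snapshots/pre_migration/
    tar -czf cluster-A_node1-pre_migration.tar.gz \
        /var/lib/cassandra/data/keyspace_name/*/snapshots/pre_migration

    # Ship it off-box (s3cmd and the bucket name are just what we use).
    s3cmd put cluster-A_node1-pre_migration.tar.gz s3://our-backups/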
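Next, step (a): pulling the per-node token lists out of cluster-A. With
vnodes, nodetool ring prints one row per token, with the owning node's
address in the first column and the token in the last, so something like
this should produce the comma-delimited list that step (b) wants for
initial_token. The 10.0.0.1 address is a placeholder, and the awk field
positions assume the ring output format we see on 1.2, so this may need
tweaking:

    # Dump the full ring; with 5 nodes at the default num_tokens: 256
    # this is roughly 1280 rows.
    nodetool ring

    # Collect every token owned by one source node as a single
    # comma-delimited line, ready to paste into initial_token on the
    # matching target node.
    nodetool ring | awk '$1 == "10.0.0.1" { print $NF }' | paste -s -d, -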
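Then steps (b)-(d) on each target node, before its first start. The token
values below are shortened placeholders -- the real initial_token line
would carry the node's full token list -- and whether num_tokens must be
commented out when initial_token is set explicitly is exactly the part Rob
flagged as untested:

    # cassandra.yaml on cluster-B_node1 (placeholder tokens -- use the
    # actual list extracted from cluster-A_node1, all on one line)
    initial_token: -9181802964342000736,-8726377837591112082,-7087354570111869710

    # Possibly must be unset/commented when initial_token is explicit:
    # num_tokens: 256

    # Per step (d); remove again once the node is up with the right
    # ranges (step h).
    auto_bootstrap: false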
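Finally, steps (f)-(g) on each target node once it is up and the schema
has been loaded. GNU tar strips the leading / on extraction, hence the
/tmp/restore/var/... path below; table names and tags are again
placeholders, and CASSANDRA-6245 is the sstable-loading caveat Rob linked
above:

    # Fetch and unpack the matching source node's archive.
    s3cmd get s3://our-backups/cluster-A_node1-pre_migration.tar.gz
    mkdir -p /tmp/restore
    tar -xzf cluster-A_node1-pre_migration.tar.gz -C /tmp/restore

    # Copy each table's snapshot sstables into the live data directory.
    cp /tmp/restore/var/lib/cassandra/data/keyspace_name/table_name/snapshots/pre_migration/* \
       /var/lib/cassandra/data/keyspace_name/table_name/

    # Pick up the new sstables: either restart cassandra on this node, or
    nodetool refresh keyspace_name table_name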