Hi guys,

We ended up prototyping DynamoDB for a variety of reasons, so I didn't get to 
test this during the workweek, but it's still interesting as I use Riak for my 
personal projects (although not YZ yet).

So here is the rough process I followed (mainly from 
http://docs.basho.com/riak/latest/ops/running/nodes/renaming/#Clusters-from-Backups):

1. Stop the Riak process on all nodes (on freshly restored nodes, Riak refuses to 
start anyway)
2. Restore the Riak nodes (from any filesystem-level backup)
3. Change the Riak nodename in every node's config
4. Re-IP a single node (riak-admin reip)
5. Start Riak on the re-IP'd node
6. Mark "down" all nodes that will be replaced (see the sketch below)

So far the YZ directories are intact on all nodes, so this is where the fun 
starts.  Fetching a YZ doc count returns an error right now (since there is 
only one node up).

7. On each remaining node (but not the one already running), remove the 
/var/lib/riak/ring directory
8. Start all remaining nodes (sketched below)
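
On each of those nodes that's just the following (same data dir assumption as 
above):

```
# Steps 7-8, on every node except the one already running: drop the old
# ring so the node comes up standalone, then start it.
rm -rf /var/lib/riak/ring
riak start
```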

After this, we have 5 running Riak nodes, all still with their full YZ indexes.  
Only one node knows it's supposed to be in a cluster; the other 4 are standalone 
nodes right now.

9. Join the nodes to the cluster.
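
The join itself is the standard command, run on each of the 4 standalone nodes 
(the target node name is a placeholder):

```
# Step 9: join each standalone node to the node that kept its ring.
riak-admin cluster join riak@new-node1.example.com
```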

This is where the YZ indexes are removed.  It isn't immediate and takes a few 
minutes to take effect, but it happens before ANY node is force-replaced, and 
also before the cluster plan is committed.

Another interesting behaviour pops up here.  If you run 'riak-admin cluster 
clear' on the "primary" node, the cluster status on that node updates, but all 
the other nodes still see the full cluster plan, at least for a while, until 
they eventually realise and crash.  If someone wants more detail on this crash, 
let me know what I can include.

10. So with the YZ indexes gone, let's heal the cluster: force-replace the 
nodes as applicable and commit the new cluster plan.
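
Which looks roughly like this (old/new names are placeholders again):

```
# Step 10: map each old node name onto the new node that replaces it,
# then review and commit the transition plan.
riak-admin cluster force-replace riak@old-node2.example.com riak@new-node2.example.com
riak-admin cluster force-replace riak@old-node3.example.com riak@new-node3.example.com
riak-admin cluster plan
riak-admin cluster commit
```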

So we now have a working cluster, with some of the original YZ documents (from 
the starting node).  Querying for all documents returns 35-39 million docs for 
me (depending on how many vnodes hit that working YZ node, I think).  So let's 
try to get things restored properly.

11. Stop a node.
12. Replace its /var/lib/yz and yz_anti_entropy directories from the backup.
13. Start the node. (49-53 million docs)
14. Repeat 11-13 for node 3. (73-79 million docs)
15. Repeat 11-13 for node 4. (97-103 million docs)
16. Repeat 11-13 for node 5. (126 million docs +- ~40,000 docs)
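
Per node, that restore pass is roughly the following.  I'm assuming the 
Yokozuna data lives under /var/lib/riak and the backup is unpacked at 
/backup/riak; adjust the paths to your layout.

```
# Steps 11-13, repeated for each remaining node: stop it, swap the
# Yokozuna index and AAE directories back in from the backup, start it again.
riak stop
rm -rf /var/lib/riak/yz /var/lib/riak/yz_anti_entropy
cp -a /backup/riak/yz /backup/riak/yz_anti_entropy /var/lib/riak/
riak start
```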

So overall, it works really well with some extra hackery.  The docs mentioned 
at the start can be followed fairly closely.  Before step 4 of "Bringing up the 
remaining nodes", the yz directory needs to be moved out of the way, or Riak 
will nuke it on cluster join.  After the cluster is fully reformed and stable, 
adding the yz directories back in seems to work fine (on stopped nodes at 
least), and I would imagine AAE would repair anything that's still inconsistent.
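
In other words, on each node being rejoined, something along these lines (same 
path assumptions as above):

```
# Before the node is joined: park the Yokozuna data so the join can't delete it.
mv /var/lib/riak/yz /var/lib/riak/yz.keep

# Once the cluster is fully reformed and stable: put it back while the node is stopped.
riak stop
mv /var/lib/riak/yz.keep /var/lib/riak/yz
riak start
```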

Hope that helps someone else, or gets the docs updated a bit.  It's a fairly 
big gotcha that the YZ indexes get dropped if you don't do things exactly right; 
maybe Riak could handle this on its own?  During a cluster join, it could stop 
YZ and move the directory to a backup location, then delete it once the join is 
committed if it isn't needed (e.g. the node wasn't force-replaced).

Either way, it's not too hard to do in a shell script or any sort of 
configuration management, and it only affects restores.  The backup procedure of 
stopping the Riak node, taking a filesystem snapshot, and starting the node 
again works fine.
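
For completeness, the per-node backup side is just the following (the tar 
command is only an example; any filesystem-level snapshot of the data dir 
works):

```
# Backup: stop Riak, copy or snapshot the data dir, start Riak again.
riak stop
tar -czf "/backup/riak-$(hostname)-$(date +%F).tar.gz" /var/lib/riak
riak start
```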

I still have the cluster available if there is anything else you want me to 
test.

Jason Campbell


> On 1 May 2015, at 22:51, Matthew Brender <mbren...@basho.com> wrote:
> 
> Hi Jason, 
> 
> Did you and Zeeshan have time to follow up on your experiments? I'm curious 
> to hear the conclusion. Please reply to the riak-user thread so others can 
> learn as well! 
> 
> Best, 
> Matt
> 
> Matt Brender | Developer Advocacy Lead
> Basho Technologies
> t: @mjbrender
> 
> 
> On Fri, Apr 24, 2015 at 8:56 PM, Jason Campbell <xia...@xiaclo.net> wrote:
> This may be a case of force-replace vs replace vs reip.  I'm happy to see if 
> I can get a new cluster from backup to keep the Solr indexes.
> 
> The disk backup was all of /var/lib/riak, so definitely included the YZ 
> indexes before the force-replace, and they were kept on the first node that 
> was changed with reip.  I stopped each node before the snapshot to ensure 
> consistency.  So I would expect the final restored cluster to be somewhere 
> between the first and last node snapshot in terms of data, and AAE to repair 
> things to a consistent state for that few minute gap.
> 
> I'll experiment with different methods of rebuilding the cluster on Monday 
> and see if I can get it to keep the Solr indexes.  Maybe moving the YZ 
> indexes out of the way during the force-replace, then stopping the node and 
> putting them back could help as well.  I'll let you know the results of the 
> experiments either way.
> 
> Thanks,
> Jason
> 
> > On 25 Apr 2015, at 09:25, Zeeshan Lakhani <zlakh...@basho.com> wrote:
> >
> > Hey Jason,
> >
> > Yeah, nodes can normally be joined without a cluster dropping its Solr 
> > Index and AAE normally rebuilds the missing KV bits.
> >
> > In the case of restoring from a backup and having missing data, we can only 
> > recommend a reindex (the indexes that have the issue) with aggressive AAE 
> > settings to speed things up. It can be pretty fast. Recreating indexes is 
> > cheap in Yokozuna, but are the `data/yz` directories missing from the nodes 
> > that were force-replaced? Unless someone else wants to chime in, I’ll 
> > gather more info on what occurred from the reip vs the force-replace.
> >
> > Zeeshan Lakhani
> > programmer |
> > software engineer at @basho |
> > org. member/founder of @papers_we_love | paperswelove.org
> > twitter => @zeeshanlakhani
> >
> >> On Apr 24, 2015, at 7:02 PM, Jason Campbell <xia...@xiaclo.net> wrote:
> >>
> >> Is there a way to do a restore without rebuilding these indexes though?  
> >> Obviously this could take a long time depending on the amount of indexed 
> >> data in the cluster.  It's a fairly big gotcha to say that Yokozuna fixes 
> >> a lot of the data access issues that Riak has, but if you restore from a 
> >> backup, it could be useless for days or weeks.
> >>
> >> As far as disk consistency, the nodes were stopped during the snapshot, so 
> >> I'm assuming on-disk it would be consistent within a single node.  And 
> >> cluster wide, I would expect the overall data to fall somewhere between 
> >> the first and last node snapshot.  AAE should still repair the bits left 
> >> over, but it shouldn't have to rebuild the entire Solr index.
> >>
> >> So the heart of the question is: can I join a node to a cluster without 
> >> dropping its Solr index?  force-replace obviously doesn't work, so what is 
> >> the harm in running reip on every node instead of just the first?
> >>
> >> Thanks for the help,
> >> Jason
> >>
> >>> On 25 Apr 2015, at 00:36, Zeeshan Lakhani <zlakh...@basho.com> wrote:
> >>>
> >>> Hey Jason,
> >>>
> >>> Here’s a little more discussion on Yokozuna backup strategies: 
> >>> http://lists.basho.com/pipermail/riak-users_lists.basho.com/2014-January/014514.html.
> >>>
> >>> Nonetheless, I wouldn’t say the behavior’s expected, but we’re going to 
> >>> be adding more to the docs on how to rebuild indexes.
> >>>
> >>> To do so, you could just remove the yz_anti_entropy directory, and make 
> >>> AAE more aggressive, via
> >>>
> >>> ```
> >>> rpc:multicall([node() | nodes()], application, set_env, [yokozuna, 
> >>> anti_entropy_build_limit, {100, 1000}]).
> >>> rpc:multicall([node() | nodes()], application, set_env, [yokozuna, 
> >>> anti_entropy_concurrency, 4]).
> >>> ```
> >>>
> >>> and the indexes will rebuild. You can try to initialize the building of 
> >>> trees with `yz_entropy_mgr:init([])` via `riak attach`, but a restart 
> >>> would also kick AAE into gear. There’s a bit more related info on this 
> >>> thread: 
> >>> http://lists.basho.com/pipermail/riak-users_lists.basho.com/2015-March/016929.html.
> >>>
> >>> Thanks.
> >>>
> >>> Zeeshan Lakhani
> >>> programmer |
> >>> software engineer at @basho |
> >>> org. member/founder of @papers_we_love | paperswelove.org
> >>> twitter => @zeeshanlakhani
> >>>
> >>>> On Apr 24, 2015, at 1:34 AM, Jason Campbell <xia...@xiaclo.net> wrote:
> >>>>
> >>>> I think I figured it out.
> >>>>
> >>>> I followed this guide: 
> >>>> http://docs.basho.com/riak/latest/ops/running/nodes/renaming/#Clusters-from-Backups
> >>>>
> >>>> The first Riak node (changed with riak-admin reip) kept its Solr index. 
> >>>>  However, the other nodes when joined via riak-admin cluster 
> >>>> force-replace, dropped their Solr indexes.
> >>>>
> >>>> Is this expected?  If so, it should really be in the docs, and there 
> >>>> should be another way to restore a cluster keeping Solr intact.
> >>>>
> >>>> Also, is there a way to rebuild a Solr index?
> >>>>
> >>>> Thanks,
> >>>> Jason
> >>>>
> >>>>> On 24 Apr 2015, at 15:16, Jason Campbell <xia...@xiaclo.net> wrote:
> >>>>>
> >>>>> I've just done a backup and restore of our production Riak cluster, and 
> >>>>> Yokozuna has dropped from around 125 million records to 25 million.  
> >>>>> Obviously the IPs have changed, and although the Riak cluster is 
> >>>>> stable, I'm not sure Solr handled the transition as nicely.
> >>>>>
> >>>>> Is there a way to force Solr to rebuild the indexes, or at least get 
> >>>>> back to the state it was in before the backup?
> >>>>>
> >>>>> Also, is this expected behaviour?
> >>>>>
> >>>>> Thanks,
> >>>>> Jason
> >>>>
> >>>>
> >>>
> >>
> >
> 
> 
> 


_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
