How can I check that AAE trees have expired? Yesterday I ran `riak_core_util:rpc_every_member_ann(yz_entropy_mgr, expire_trees, [], 5000).` on each node (just to be sure). Still, today I see that on 3 nodes (of 5) all entropy trees and all of the last AAE exchanges are older than 20 days.
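One way to cross-check whether the index itself is consistent, independent of the AAE tree timestamps, is to repeat a single identical query and compare num_found across runs; the drift described further down this thread shows up as a spread between the minimum and maximum counts. A minimal sketch (not part of the original exchange), assuming the official Riak Python client and a placeholder index name:

    # Sketch: run the same Solr query several times and compare num_found.
    # A fully indexed, healthy cluster should return a stable count.
    # 'fs_chunks_index' is a placeholder; adjust host/port for your cluster.
    from riak import RiakClient

    client = RiakClient()
    counts = []
    for _ in range(10):
        result = client.fulltext_search('fs_chunks_index', '*:*', rows=0)
        counts.append(result['num_found'])

    print('num_found: min=%d max=%d spread=%d'
          % (min(counts), max(counts), max(counts) - min(counts)))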
On 4 April 2016 at 17:15, Oleksiy Krivoshey <oleks...@gmail.com> wrote:

> Continuation...
>
> The new index has the same inconsistent search results problem. I took a snapshot of the `riak-admin search aae-status` output almost every day, and there were absolutely no Yokozuna errors in the logs.
>
> I can see that some AAE trees were not expired (built more than 20 days ago). I can also see that on two nodes (of 5) the last AAE exchanges happened more than 20 days ago.
>
> For now I have issued `riak_core_util:rpc_every_member_ann(yz_entropy_mgr, expire_trees, [], 5000).` on each node again. I will wait 10 more days, but I don't think that will fix anything.
>
> On 25 March 2016 at 09:28, Oleksiy Krivoshey <oleks...@gmail.com> wrote:
>
>> One interesting moment happened when I tried removing the index:
>>
>> - this index was associated with a bucket type called fs_chunks
>> - so I first called RpbSetBucketTypeReq to set search_index: _dont_index_
>> - I then tried to remove the index with RpbYokozunaIndexDeleteReq, which failed with "index is in use" and a list of all buckets of the fs_chunks type
>> - for some reason all these buckets had their own search_index property set to that same index
>>
>> How can this happen if I definitely never set the search_index property per bucket?
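The per-bucket search_index properties described above can also be inspected, and if necessary reset, from a client. A short sketch (not part of the original thread), assuming the official Riak Python client; the bucket names are placeholders and the write is left commented out:

    # Sketch: show the search_index property of individual buckets under the
    # fs_chunks bucket type, to see which buckets still reference the index.
    from riak import RiakClient

    client = RiakClient()
    fs_chunks = client.bucket_type('fs_chunks')

    for name in ['bucket_a', 'bucket_b']:  # placeholder bucket names
        bucket = fs_chunks.bucket(name)
        props = bucket.get_properties()
        print(name, props.get('search_index'))
        # To clear the per-bucket reference (the same sentinel value that was
        # set on the bucket type above), uncomment the next line:
        # bucket.set_property('search_index', '_dont_index_')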
>> On 24 March 2016 at 22:41, Oleksiy Krivoshey <oleks...@gmail.com> wrote:
>>
>>> OK!
>>>
>>> On 24 March 2016 at 21:11, Magnus Kessler <mkess...@basho.com> wrote:
>>>
>>>> Hi Oleksiy,
>>>>
>>>> On 24 March 2016 at 14:55, Oleksiy Krivoshey <oleks...@gmail.com> wrote:
>>>>
>>>>> Hi Magnus,
>>>>>
>>>>> Thanks! I guess I will go with index deletion, because I've already tried expiring the trees before.
>>>>>
>>>>> Do I need to delete the AAE data somehow, or is removing the index enough?
>>>>
>>>> If you expire the AAE trees with the commands I posted earlier, there should be no need to remove the AAE data directories manually.
>>>>
>>>> I hope this works for you. Please monitor the tree rebuilds and exchanges with `riak-admin search aae-status` for the next few days. In particular, the exchanges should be ongoing on a continuous basis once all trees have been rebuilt. If they aren't, please let me know. At that point you should also gather `riak-debug` output from all nodes before it gets rotated out (after 5 days by default).
>>>>
>>>> Kind Regards,
>>>>
>>>> Magnus
>>>>
>>>>> On 24 March 2016 at 13:28, Magnus Kessler <mkess...@basho.com> wrote:
>>>>>
>>>>>> Hi Oleksiy,
>>>>>>
>>>>>> As a first step, I suggest simply expiring the Yokozuna AAE trees again if the output of `riak-admin search aae-status` still suggests that no recent exchanges have taken place. To do this, run `riak attach` on one node and then
>>>>>>
>>>>>> riak_core_util:rpc_every_member_ann(yz_entropy_mgr, expire_trees, [], 5000).
>>>>>>
>>>>>> Exit from the riak console with `Ctrl+G q`.
>>>>>>
>>>>>> Depending on your settings and the amount of data, the full index should be rebuilt within the next 2.5 days (for a cluster with ring size 128 and default settings). You can monitor the progress with `riak-admin search aae-status` and also in the logs, which should have messages along the lines of
>>>>>>
>>>>>> 2016-03-24 10:28:25.372 [info] <0.4647.6477>@yz_exchange_fsm:key_exchange:179 Repaired 83055 keys during active anti-entropy exchange of partition 1210306043414653979137426502093171875652569137152 for preflist {1164634117248063262943561351070788031288321245184,3}
>>>>>>
>>>>>> Re-indexing can put additional strain on the cluster and may cause elevated latency on a cluster already under heavy load. Please monitor the response times while the cluster is re-indexing data.
>>>>>>
>>>>>> If the cluster load allows it, you can force more rapid re-indexing by changing a few parameters. Again at the `riak attach` console, run
>>>>>>
>>>>>> riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna, anti_entropy_build_limit, {4, 60000}], 5000).
>>>>>> riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna, anti_entropy_concurrency, 5], 5000).
>>>>>>
>>>>>> This will allow up to 4 trees per node to be built/exchanged per hour, with up to 5 concurrent exchanges throughout the cluster. To return to the default settings, use
>>>>>>
>>>>>> riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna, anti_entropy_build_limit, {1, 360000}], 5000).
>>>>>> riak_core_util:rpc_every_member_ann(application, set_env, [yokozuna, anti_entropy_concurrency, 2], 5000).
>>>>>>
>>>>>> If the cluster still doesn't make any progress with automatically re-indexing data, the next steps are pretty much what you already suggested: drop the existing index and re-index from scratch. I'm assuming that losing the indexes temporarily is acceptable to you at this point.
>>>>>>
>>>>>> Using any client API that supports RpbYokozunaIndexDeleteReq, you can drop the index from all Solr instances, losing any data stored there immediately. Next, you'll have to re-create the index. I have tried this with the Python API, where I deleted the index and re-created it with the same already uploaded schema:
>>>>>>
>>>>>> from riak import RiakClient
>>>>>>
>>>>>> c = RiakClient()
>>>>>> c.delete_search_index('my_index')
>>>>>> c.create_search_index('my_index', 'my_schema')
>>>>>>
>>>>>> Note that simply deleting the index does not remove its existing association with any bucket or bucket type. Any PUT operations on these buckets will lead to indexing failures being logged until the index has been recreated. However, this also means that no separate operation in `riak-admin` is required to associate the newly recreated index with the buckets again.
>>>>>>
>>>>>> After recreating the index, expire the trees as explained previously.
>>>>>>
>>>>>> Let us know if this solves your issue.
>>>>>>
>>>>>> Kind Regards,
>>>>>>
>>>>>> Magnus
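As a follow-on to the delete/re-create example above, it can be worth confirming that the index exists again and points at the intended schema before expiring the trees. A sketch (not from the original thread), using the same placeholder names and assuming the Python client's get_search_index/get_search_schema helpers:

    # Sketch: verify the re-created index and its schema binding.
    from riak import RiakClient

    c = RiakClient()

    index = c.get_search_index('my_index')
    print(index)  # expected to include the index name, its schema and n_val
    assert index['schema'] == 'my_schema'

    schema = c.get_search_schema('my_schema')
    print(schema['name'], len(schema['content']), 'characters of schema XML')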
>>>>>> On 24 March 2016 at 08:44, Oleksiy Krivoshey <oleks...@gmail.com> wrote:
>>>>>>
>>>>>>> This is how things are looking after two weeks:
>>>>>>>
>>>>>>> - there have been no Solr indexing issues for a long period (2 weeks)
>>>>>>> - there have been no Yokozuna errors at all for 2 weeks
>>>>>>> - there is an index with an essentially empty schema (just the _yz_* fields); the objects stored in the bucket(s) are binary and so are not analysed by Yokozuna
>>>>>>> - the same Yokozuna query, repeated, gives a different num_found each time; typically the difference between the real number of keys in a bucket and num_found is about 25%
>>>>>>> - the number of keys repaired by AAE (according to the logs) is about 1-2 every few hours (the number of keys "missing" from the index is close to 1,000,000)
>>>>>>>
>>>>>>> Should I now try to delete the index and the Yokozuna AAE data and wait another 2 weeks? If yes, how should I delete the index and the AAE data? Will RpbYokozunaIndexDeleteReq be enough?
>>>>>>
>>>>>> --
>>>>>> Magnus Kessler
>>>>>> Client Services Engineer
>>>>>> Basho Technologies Limited
>>>>>>
>>>>>> Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431
>>>>
>>>> --
>>>> Magnus Kessler
>>>> Client Services Engineer
>>>> Basho Technologies Limited
>>>>
>>>> Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431
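The "real number of keys versus num_found" comparison mentioned in the status report above can be scripted for a single bucket. A sketch (not part of the original thread), assuming the official Riak Python client and the default Yokozuna _yz_rt/_yz_rb fields; the bucket and index names are placeholders, and listing keys is expensive, so this is only suitable as a one-off diagnostic:

    # Sketch: compare the number of keys actually in one bucket with the
    # number of documents Solr reports for that bucket.
    from riak import RiakClient

    client = RiakClient()
    bucket = client.bucket_type('fs_chunks').bucket('some_bucket')  # placeholder

    real_count = len(bucket.get_keys())  # expensive: full key listing
    result = client.fulltext_search(
        'fs_chunks_index',
        '_yz_rt:fs_chunks AND _yz_rb:some_bucket',
        rows=0)

    print('keys in bucket: %d, num_found: %d' % (real_count, result['num_found']))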