Hi Fred, This is production environment but I can delete the index. However this index covers ~3500 buckets and there are probably 10,000,000 keys.
The index was created after the buckets. The schema for the index is just the basic required fields (_yz_*) and nothing else. Yes, I'm willing to resolve this. When you say to delete chunks_index, do you mean the simple RpbYokozunaIndexDeleteReq or something else is required? Thanks! On 11 March 2016 at 17:08, Fred Dushin <fdus...@basho.com> wrote: > Hi Oleksiy, > > This is definitely pointing to an issue either in the coverage plan (which > determines the distributed query you are seeing) or in the data you have in > Solr. I am wondering if it is possible that you have some data in Solr > that is causing the rebuild of the YZ AAE tree to incorrectly represent > what is actually stored in Solr. > > What you did was to manually expire the YZ (Riak Search) AAE trees, which > caused them to rebuild from the entropy data stored in Solr. Another thing > we could try (if you are willing) would be to delete the 'chunks_index' > data in Solr (as well as the Yokozuna AAE data), and then let AAE repair > the missing data. What Riak will essentially do is compare the KV hash > trees with the YZ hash trees (which will be empty), too that it is missing > in Solr, and add it to Solr, as a result. This would effectively result in > re-indexing all of your data, but we are only talking about ~30k entries > (times 3, presumably, if your n_val is 3), so that shouldn't take too much > time, I wouldn't think. There is even some configuration you can use to > accelerate this process, if necessary. > > Is that something you would be willing to try? It would result in down > time on query. Is this production data or a test environment? > > -Fred > > On Mar 11, 2016, at 7:38 AM, Oleksiy Krivoshey <oleks...@gmail.com> wrote: > > Here are two consequent requests, one returns 30118 keys, another 37134 > > <?xml version="1.0" encoding="UTF-8"?> > <response> > <lst name="responseHeader"> > <int name="status">0</int> > <int name="QTime">6</int> > <lst name="params"> > <str name="10.0.1.3:8093">_yz_pn:92 OR _yz_pn:83 OR _yz_pn:71 OR > _yz_pn:59 OR _yz_pn:50 OR _yz_pn:38 OR _yz_pn:17 OR _yz_pn:5</str> > <str name="10.0.1.2:8093">_yz_pn:122 OR _yz_pn:110 OR _yz_pn:98 OR > _yz_pn:86 OR _yz_pn:74 OR _yz_pn:62 OR _yz_pn:26 OR _yz_pn:14 OR > _yz_pn:2</str> > <str name="shards"> > 10.0.1.1:8093/internal_solr/chunks_index,10.0.1.2:8093/internal_solr/chunks_index,10.0.1.3:8093/internal_solr/chunks_index,10.0.1.4:8093/internal_solr/chunks_index,10.0.1.5:8093/internal_solr/chunks_index > </str> > <str name="q">_yz_rb:0dmid2ilpyrfiuaqtvnc482f1esdchb5.chunks</str> > <str name="10.0.1.5:8093">(_yz_pn:124 AND (_yz_fpn:124 OR > _yz_fpn:123)) OR _yz_pn:116 OR _yz_pn:104 OR _yz_pn:80 OR _yz_pn:68 OR > _yz_pn:56 OR _yz_pn:44 OR _yz_pn:32 OR _yz_pn:20 OR _yz_pn:8</str> > <str name="10.0.1.1:8093">_yz_pn:113 OR _yz_pn:101 OR _yz_pn:89 OR > _yz_pn:77 OR _yz_pn:65 OR _yz_pn:53 OR _yz_pn:41 OR _yz_pn:29</str> > <str name="10.0.1.4:8093">_yz_pn:127 OR _yz_pn:119 OR _yz_pn:107 OR > _yz_pn:95 OR _yz_pn:47 OR _yz_pn:35 OR _yz_pn:23 OR _yz_pn:11</str> > <str name="rows">0</str> > </lst> > </lst> > <result maxScore="6.364349" name="response" numFound="30118" > start="0"></result> > </response> > > ------ > > > <?xml version="1.0" encoding="UTF-8"?> > <response> > <lst name="responseHeader"> > <int name="status">0</int> > <int name="QTime">10</int> > <lst name="params"> > <str name="10.0.1.3:8093">_yz_pn:100 OR _yz_pn:88 OR _yz_pn:79 OR > _yz_pn:67 OR _yz_pn:46 OR _yz_pn:34 OR _yz_pn:25 OR _yz_pn:13 OR > _yz_pn:1</str> > <str name="10.0.1.2:8093">(_yz_pn:126 AND (_yz_fpn:126 OR > _yz_fpn:125)) OR _yz_pn:118 OR _yz_pn:106 OR _yz_pn:94 OR _yz_pn:82 OR > _yz_pn:70 OR _yz_pn:58 OR _yz_pn:22 OR _yz_pn:10</str> > <str name="shards"> > 10.0.1.1:8093/internal_solr/chunks_index,10.0.1.2:8093/internal_solr/chunks_index,10.0.1.3:8093/internal_solr/chunks_index,10.0.1.4:8093/internal_solr/chunks_index,10.0.1.5:8093/internal_solr/chunks_index > </str> > <str name="q">_yz_rb:0dmid2ilpyrfiuaqtvnc482f1esdchb5.chunks</str> > <str name="10.0.1.5:8093">_yz_pn:124 OR _yz_pn:112 OR _yz_pn:76 OR > _yz_pn:64 OR _yz_pn:52 OR _yz_pn:40 OR _yz_pn:28 OR _yz_pn:16 OR > _yz_pn:4</str> > <str name="10.0.1.1:8093">_yz_pn:121 OR _yz_pn:109 OR _yz_pn:97 OR > _yz_pn:85 OR _yz_pn:73 OR _yz_pn:61 OR _yz_pn:49 OR _yz_pn:37</str> > <str name="10.0.1.4:8093">_yz_pn:115 OR _yz_pn:103 OR _yz_pn:91 OR > _yz_pn:55 OR _yz_pn:43 OR _yz_pn:31 OR _yz_pn:19 OR _yz_pn:7</str> > <str name="rows">0</str> > </lst> > </lst> > <result maxScore="6.364349" name="response" numFound="37134" > start="0"></result> > </response> > > On 11 March 2016 at 12:05, Oleksiy Krivoshey <oleks...@gmail.com> wrote: > >> So event when I fixed 3 documents which caused AAE errors, >> restarted AAE with riak_core_util:rpc_every_member_ann(yz_entropy_mgr, >> expire_trees, [], 5000). >> waited 5 days (now I see all AAE trees rebuilt in last 5 days and no AAE >> or Solr errors), I still get inconsistent num_found. >> >> For a bucket with 30,000 keys each new search request can result in >> difference in num_found for over 5,000. >> >> What else can I do to get consistent index, or at least not a 15% >> difference. >> >> I even tried to walk through all the bucket keys and modifying them in a >> hope that all Yokozuna instances in a cluster will pick them up, but no >> luck. >> >> Thanks! >> >> > > _______________________________________________ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > > > > _______________________________________________ > riak-users mailing list > riak-users@lists.basho.com > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com > >
_______________________________________________ riak-users mailing list riak-users@lists.basho.com http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com