Hmmm, definitely not great... but thanks, Ryan, for the explanation.

Francisco

2011/8/23 Ryan Zezeski <rzeze...@basho.com>

> Gordon,
>
> The reason for the nondeterministic behavior is two-fold.
>
> 1. For performance reasons Search only ever reads from 1 node (R=1)
>
> 2. As an attempt to balance load and reduce vnode contention this node is
> selected randomly
>
> This is why it works 50% of the time. Because now, for each index entry, 2
> partitions have the data and 1 does not. So depending on which one you hit
> you'll get the data or not. Furthermore, this behavior will continue until
> you reindex because the index in Search has no form of anti-entropy such as
> read repair or merkle trees.
>
> In the future the easiest thing is to replace that lost node as quickly as
> possible. While it's down the other nodes will keep track of the new index
> entries and will transfer them during data handoff when the node comes alive
> again. By removing the node you've changed the ring and your only option is
> to reindex as you are already doing. I realize that bringing that node up
> or replacing it may not have been an option but this is the only way to
> avoid this problem with Search as it stands today.
>
> I realize this sucks and isn't in line with Riak's more fault tolerant
> behavior. It does suck. I hate the fact that I have to write this email
> basically telling you this part of Search is broken, IMO. I want to see it
> addressed and I'm sure I'm not the only one. Right now our internal ticket
> board is buzzing in anticipation for the new release. After that there is a
> lot of love I want to give Search, this particular issue included. I'd say
> it's only a matter of time.
>
> -Ryan
>
> On Fri, Aug 19, 2011 at 2:46 PM, Gordon Tillman <gtill...@mezeo.com> wrote:
>
>> Greetings all,
>>
>> After an extended datacenter power outage, a 3-node Riak cluster shut
>> down. When the power was restored, two of the three nodes came back up.
>> Don't know what is going on with the third node. But in the mean time, have
>> removed the dead node from the ring. The two remaining nodes show a good
>> ringready status.
>>
>> The problem is that the search indexes appear to be in an inconsistent
>> state. For example, I can issue the same solr query on one of the nodes and
>> 50% of the time it returns correct results. The other times it returns an
>> empty result set.
>>
>> I'm in the process of re-indexing the bucket in question (a very
>> time-consuming affair). But I wonder if anyone could shed some light on
>> this situation as to why it occurred in the first place and if there is
>> anything that can be done to keep this from happening again in the future.
>>
>> Many thanks,
>>
>> --gordon
>> _______________________________________________
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
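
For anyone who wants to see the effect Ryan describes in isolation, here is a rough toy model in Python. It is not Riak Search code: the function name query_hit_rate and the list-of-booleans representation of replica partitions are made up for illustration. It only assumes what the explanation above states, namely that each read goes to a single randomly chosen replica (R=1) and that nothing repairs a replica that is missing the entry. The exact hit rate in a real cluster depends on how Search weights the candidates, which this sketch does not try to reproduce.

```python
import random


def query_hit_rate(replicas_with_entry, trials=10_000, seed=0):
    """Return the fraction of single-replica (R=1) reads that find the entry.

    replicas_with_entry holds one bool per replica partition: True if that
    partition still has the index entry, False if it lost it.
    This is a hypothetical model, not how Riak Search is implemented.
    """
    rng = random.Random(seed)
    # Each trial reads from exactly one replica, picked at random.
    hits = sum(rng.choice(replicas_with_entry) for _ in range(trials))
    return hits / trials


# All replicas hold the entry: every read succeeds.
healthy = query_hit_rate([True, True, True])

# One replica is missing the entry: the same query sometimes comes back empty.
degraded = query_hit_rate([True, True, False])

print(f"healthy:  {healthy:.2f}")
print(f"degraded: {degraded:.2f}")
```

Because the model has no read repair, the degraded rate stays the same no matter how many times the query is repeated, which matches the point that the inconsistency persists until a reindex.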