Francisco,

The problem you are experiencing is not due to Search; it is most likely due 
to the way in which the partitions of the ring have been assigned to the 
nodes of your cluster. If only one node has failed and you are getting that 
no_candidate_nodes error, it means that the preference list of at least one 
of the mapreduce inputs consists entirely of partitions on that one downed 
node. In other words, all three replicas of some of your data have ended up 
on the same physical node. This is an unusual circumstance, so it would be 
good to examine the contents of your ring so we can verify that this is the 
case. If you could use riak attach to get a console on one of your nodes, 
run the following command, and put the output in a gist or pastebin, it will 
hopefully shed light on the problem.

        io:format("~p~n", [riak_core_ring_manager:get_my_ring()]).

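If you want to dig a little further yourself, here is a rough sketch along 
the same lines. It assumes riak_core_ring:all_owners/1 is exported by your 
version of riak_core (it should be, but treat that as an assumption). It 
counts partitions per node and flags any node that owns three consecutive 
partitions, which is exactly the situation that would empty a preference 
list when that node goes down:

        %% sketch only; assumes riak_core_ring:all_owners/1 is available
        {ok, Ring} = riak_core_ring_manager:get_my_ring(),
        %% one owner entry per partition, in ring order
        Owners = [Node || {_Idx, Node} <- riak_core_ring:all_owners(Ring)],
        Counts = lists:foldl(fun(Nd, D) -> dict:update_counter(Nd, 1, D) end,
                             dict:new(), Owners),
        io:format("partitions per node: ~p~n", [dict:to_list(Counts)]),
        %% wrap around so the last partitions are checked against the first
        Wrapped = Owners ++ lists:sublist(Owners, 2),
        Triples = [lists:sublist(Wrapped, I, 3)
                   || I <- lists:seq(1, length(Owners))],
        Bad = [N || [N, N, N] <- Triples],
        io:format("nodes owning 3 consecutive partitions: ~p~n",
                  [lists:usort(Bad)]).

An empty list on the last line is the healthy result; any node name that 
shows up there owns three replicas of some keys by itself.
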
The problem Martin described is slightly different. What he observed appears 
to be the erlang processes handling the work of the map phase dying. Since 
he described the cluster as being under heavy load when this happened, I 
suspect it is related to hitting a resource limit in the erlang VM or the 
operating system. My suggestion in that case is to move to riak 1.0.2 and 
the pipe mapreduce system (the relevant app.config setting is sketched 
below); it is much more robust for clusters under heavy load and will give 
you a better experience. Hope that helps.
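
For reference, here is a sketch of that setting. I am assuming the 
mapred_system knob in the riak_kv section of app.config, which in 1.0.x 
selects between the legacy and pipe mapreduce implementations (pipe should 
already be the default, so this is mostly a way to confirm you are not 
forcing legacy):

        %% app.config, riak_kv section (sketch; mapred_system and its
        %% default are assumptions to verify against your release notes)
        {riak_kv, [
            {mapred_system, pipe}
            %% ...other riak_kv settings unchanged
        ]}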

Kelly


On Nov 17, 2011, at 4:10 PM, francisco treacy wrote:

> This morning one node went down (3-node 0.14 cluster) and I started getting 
> the dreaded `no_candidate_nodes,exhausted_prefist` error posted earlier.
> 
> If 2 nodes are remaining, and I always use N=3 R=1 ... why is it failing? 
> Something to do with my use of Search?
> 
> Thanks
> Francisco
> 
> 
> 2011/9/28 Martin Woods <mw2...@gmail.com>
> Hi Francisco
> 
> I've seen the same error in a dev environment running on a single Riak node 
> with an n_val of 1, so in my case it was nothing to do with a failing node. I 
> wasn't running Riak Search either. I posted a question about it to this list 
> a week or so ago but haven't seen a reply yet. 
> 
> So indeed, does anyone know what's causing this error and how we can avoid it?
> 
> Regards,
> Martin. 
> 
> 
> 
> On 28 Sep 2011, at 20:39, francisco treacy <francisco.tre...@gmail.com> wrote:
> 
>> Regarding (3) I found a Forcing Read Repair contrib function 
>> (http://contrib.basho.com/bucket_inspector.html) which should help.
>> 
>> Otherwise for the m/r error, all of my buckets use default n_val and write 
>> quorum. Could it be that some data never reached that particular node in the 
>> cluster? That is, should I have used W=3? During the failure, many assets 
>> were returning 404s which triggered read-repair (and were ok upon subsequent 
>> request), but no luck with the Map/Reduce function (it kept on failing).  
>> Could it have something to do with Riak Search?
>> 
>> Thanks,
>> 
>> Francisco
>> 
>> 
>> 2011/9/26 francisco treacy <francisco.tre...@gmail.com>
>> Hi all,
>> 
>> I have a 3-node Riak cluster, and I am simulating the scenario of physical 
>> nodes crashing.
>> 
>> When 2 nodes go down, and I query the remaining one, it fails with:
>> 
>> {error,
>>     {exit,
>>         {{{error,
>>               {no_candidate_nodes,exhausted_prefist,
>>                   [{riak_kv_mapred_planner,claim_keys,3},
>>                    {riak_kv_map_phase,schedule_input,5},
>>                    {riak_kv_map_phase,handle_input,3},
>>                    {luke_phase,executing,3},
>>                    {gen_fsm,handle_msg,7},
>>                    {proc_lib,init_p_do_apply,3}],
>>                   []}},
>>           {gen_fsm,sync_send_event,
>>               [<0.31566.2330>,
>>                {inputs,
>> 
>> (...)
>> 
>> Here I'm doing a M/R, inputs being fed by Search.
>> 
>> (1) All of the involved buckets have N=3, and all involved requests use R=1 
>> (I don't really need quorum for this use case).
>> 
>> Why is it failing? I'm sure I'm missing something basic here.
>> 
>> (2) Probably worth noting: those 3 nodes are spread across *two* physical 
>> servers (1 on the small one, 2 on the beefier one). I've heard that this is 
>> "not a good idea", though I'm not sure why. These two servers are still 
>> definitely enough for our current load; should I consider adding a third one?
>> 
>> (3) To overcome the aforementioned error, I added a new node to the cluster 
>> (installed on the small server). Now the setup looks like: 4 nodes = 2 on the 
>> small server, 2 on the beefier one.
>> 
>> When 2 nodes go down, this works. Which brings me to another topic... could 
>> you point me to good strategies to "pre-"invoke read repair? Is it up to 
>> clients to scan the keyspace forcing reads? It's a disaster usability-wise 
>> when the first users start getting 404s all over the place.
>> 
>> Francisco
>> 

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
