Re: Whole cluster times out if one node is gone

Sean Cribbs Sat, 27 Nov 2010 15:21:42 -0800

1) Riak detects node outage the same way any Erlang system does - when a 
message fails to deliver, or the heartbeat maintained by epmd fails.  The 
default timeout in epmd is 1 minute, which is probably why you're seeing it 
take 1 minute to be detected.
2) If it takes too long to retrieve from any node (the vnode is overloaded, 
perhaps, or is just starting up as a hint partition), the request can time out.
3) You could probably configure epmd to timeout sooner, but then you become 
more vulnerable to temporary partitions. YMMV
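
As a rough illustration of living with point 2 on the client side, here is a minimal Python sketch that bounds how long a read may hang and then tries another node. It does not make Riak notice the dead node any sooner; it only limits how long the client waits. The URL shape is the one from the wget output quoted below; the host names, the 2-second timeout, and the helper names (riak_url, http_get, read_with_failover) are all hypothetical.

```python
import urllib.request


def riak_url(host, bucket, key, r=1, port=8098):
    """Build a read URL in the form used by the wget calls in this thread."""
    return f"http://{host}:{port}/riak/{bucket}/{key}?r={r}"


def http_get(url, timeout):
    """GET a URL, giving up after `timeout` seconds instead of hanging."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def read_with_failover(hosts, bucket, key, timeout=2.0, get=http_get):
    """Try each host in turn and return the first successful body.

    Any OSError (this covers socket timeouts, refused connections and
    urllib's URLError/HTTPError, e.g. a 503) moves on to the next host.
    Note that a 404 is also treated as a failure in this sketch.
    """
    last_error = None
    for host in hosts:
        try:
            return get(riak_url(host, bucket, key), timeout)
        except OSError as exc:
            last_error = exc
    raise RuntimeError(f"all hosts failed: {last_error}")
```

Calling read_with_failover(["nodeA.example", "nodeB.example"], "bucket", "key") would wait at most about 2 seconds per host. A short timeout has the same trade-off as point 3: a node that is merely slow gets treated as failed.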


Sean Cribbs <s...@basho.com>
Developer Advocate
Basho Technologies, Inc.
http://basho.com/

On Nov 27, 2010, at 3:21 PM, Jay Adkisson wrote:

> Neville, I'm not sure how you mean.  The network gear is all functional, 
> otherwise I wouldn't be able to interact with the machines at all (they're at 
> our colo).  But as far as I understand, if I hard reboot a box (or, in a 
> real-world scenario, the PDU fails), the switch will happily continue 
> forwarding packets into nothingness, causing HTTP requests to hang 
> indefinitely until they time out.  From what Dan said, I would expect that 
> Riak handles that sort of situation intelligently.  I guess my remaining 
> questions are:
> 
> * How does Riak detect that a node is down, and what could cause that to take 
> a full minute?
> * When N=3, what about a single node failure could cause a read with R=1 to 
> time out?
> * Is there a way to configure the strictness of when nodes are assumed dead?  
> I'm thinking like a "timeout" config option or something.
> 
> Peace,
> --Jay
> 
> On Tue, Nov 23, 2010 at 2:55 PM, Neville Burnell <neville.burn...@gmail.com> 
> wrote:
> Just a thought ... have you verified your switch, cables, NICs, etc.?
> 
> 
> On 24 November 2010 09:33, Jay Adkisson <j4yf...@gmail.com> wrote:
> (many profuse apologies to Dan - hit "reply" instead of "reply all")
> 
> Alrighty, I've done a little more digging.  When I throttle the writes 
> heavily (2/sec) and set R and W to 1 all around, the cluster works just fine 
> for about 15-20 seconds after I restart the node.  Then the read request 
> hangs for about a minute, until node D disappears from connected_nodes in 
> riak-admin status, at which point it returns the desired value (although 
> sometimes I get a 503):
> 
> --2010-11-23 13:01:28--  http://<node A>:8098/riak/<bucket>/<key>?r=1
> Resolving <node A>... <ip addr>
> Connecting to <node A>|<ip addr>|:8098... connected.
> HTTP request sent, awaiting response... <hang...> 200 OK
> Length: 3684 (3.6K) [image/jpeg]
> Saving to: `<key>?r=1'
> 
> 100%[======================================>] 3,684       --.-K/s   in 0s
> 
> 2010-11-23 13:02:21 (49.5 MB/s) - `<key>?r=1' saved [3684/3684]
> 
> --2010-11-23 13:02:23--  http://<node A>:8098/riak/<bucket>/<key>?r=1
> Resolving <node A>... <ip addr>
> Connecting to <node A>|<ip addr>|:8098... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: 3684 (3.6K) [image/jpeg]
> Saving to: `<key>?r=1'
> 
> 100%[======================================>] 3,684       --.-K/s   in 0s
> 
> 2010-11-23 13:02:23 (220 MB/s) - `<key>?r=1' saved [3684/3684]
> 
> Afterwards, node D comes back up and re-joins the cluster seamlessly.
> 
> Any insights?  
> 
> --Jay
> 
> On Mon, Nov 22, 2010 at 5:59 PM, Jay Adkisson <j4yf...@gmail.com> wrote:
> Hey Dan,
> 
> Thanks for the response!  I tried it again while watching `riak-admin status` 
> - basically, it takes about 30 seconds of node C being down before Riak 
> realizes it's gone.  During that time, if I'm writing to the cluster at all 
> (I throttled it to 2 writes per second for testing), both writes and reads 
> hang indefinitely, and sometimes time out.
> 
> I'm using Ripple to do the writes, and wget to test reads, all on node A for 
> now, since I know it'll be up.  I'm using the default R and W options for now.
> 
> Thanks for the help and clarification around ringready.
> 
> --Jay
> 
> 
> On Mon, Nov 22, 2010 at 5:15 PM, Dan Reverri <d...@basho.com> wrote:
> Your HTTP calls should not be timing out. Are you sending requests 
> directly to the Riak node or are you using a load balancer? How much load are 
> you placing on node A? Is it a write only load or are there reads as well? 
> Can you confirm "all" requests time out or is it a large subset of the 
> requests? How large are the objects being written? Are you setting R and W in 
> the request? Are you using a particular client (Ruby, Python, etc.)? Can you 
> provide the output of "riak-admin status" from node A?
> 
> Regarding the ringready command: that is behaving as I would expect, 
> considering a node is down.
> 
> Thanks,
> Dan
> 
> Daniel Reverri
> Developer Advocate
> Basho Technologies, Inc.
> d...@basho.com
> 
> 
> On Mon, Nov 22, 2010 at 4:55 PM, Jay Adkisson <j4yf...@gmail.com> wrote:
> Hey all,
> 
> Here's what I'm seeing: I have four nodes A, B, C, and D.  I'm loading lots 
> of data into node A, which is being distributed evenly across the nodes.  If 
> I physically reboot node D, all my HTTP calls time out, and `riak-admin 
> ringready` complains that not all nodes are up.  Is this intended behavior?  
> Is there a configuration option I can set so it fails more gracefully?
> 
> --Jay
> 
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
