I forgot to cc to riak-users.

---------- Forwarded message ----------
From: Tatsuya Kawano <t650...@gmail.com>
Date: 2012/9/27
Subject: Re: Reading with "r = all" always succeeds in Riak 1.2 even when one of the primary nodes is down?
To: Mike Oxford <moxf...@gmail.com>

Hi Mike,

Thanks for the detailed info. I'm currently running all Riak nodes on
one box, so I'll try to get more boxes and try pulling the network
cable out.

> Did you "shut down" the node or kill it by brutally powering the box down or
> yanking the network cable?

I only tried killing the Erlang process.

> It's possible that Riak noticed the node_down and had already done the
> recovery. While net_ticktime can be as long as 60 seconds by default, it's
> possible that you're hitting the case where you kill it and before you
> re-run the read it's already noticed and 'fixed itself?'

OK. I think this "fixed itself" behavior is not documented in the Riak
Wiki(?). Now I understand why r=all didn't fail.

Thanks!
Tatsuya

2012/9/27 Mike Oxford <moxf...@gmail.com>:
> What is the time between "node down" and "read with R=3"?
> Did you "shut down" the node or kill it by brutally powering the box down or
> yanking the network cable?
>
> It's possible that Riak noticed the node_down and had already done the
> recovery. While net_ticktime can be as long as 60 seconds by default, it's
> possible that you're hitting the case where you kill it and before you
> re-run the read it's already noticed and 'fixed itself?'
>
> Also, if you do a shutdown, the Erlang VM is probably linked/monitoring and
> being notified that the node is shutting down, so it's triggering the
> rebalance immediately.
>
> Try pulling the network cable out of that node. "/sbin/ifconfig eth0
> down" **may** give you the same effect.
>
> -mox
>
> On Wed, Sep 26, 2012 at 2:51 PM, Tatsuya Kawano <t650...@gmail.com> wrote:
>>
>> Hi,
>>
>> I'm having a hard time reproducing the behavior described on the Riak
>> wiki in my Riak 1.2 test environment. Can anybody help me figure out
>> what is happening?
>>
>> http://wiki.basho.com/Eventual-Consistency.html#Failure-Scenarios
>>
>> > Reading When One Primary Fails
>> > ------------------------------
>> >
>> > 1. Data is written to a key with W=3
>> > 2. One node goes down, it happens to be a primary for that key
>> > 3. Data is read from that key with R=3
>> > 4. Riak returns not_found on first request
>> > 5. Read repair ensures data is replicated to a secondary node.
>> >    Read repair will always occur, regardless of the R value.
>> >    Even with an R of 2, read repair will kick in and ensure that
>> >    all nodes responsible for this particular data are consistent.
>> > 6. Subsequent reads return correct value with R=3, two values
>> >    coming from primary and one from secondary nodes
>>
>> At the first read (step 4), Riak should return not_found, but it
>> actually returns the correct value. I wonder when read repair kicks
>> in in Riak 1.2 (even before the first read?).
>>
>> I followed the screencast "Tuning CAP Controls in Riak" on this page.
>> http://wiki.basho.com/Tunable-CAP-Controls-in-Riak.html
>>
>> I used riak_core_ring:preflist/2 to ensure that I had taken down one of
>> the correct primary nodes for the key.
>>
>> Thanks,
>> Tatsuya
>>
>> --
>> Tatsuya Kawano (Mr.)
>> Tokyo, Japan
>>
>> _______________________________________________
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
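
For anyone who wants to repeat the test from the wiki steps quoted above, here is a minimal sketch in Python that writes a key with W=3 and reads it back with r=all over Riak's HTTP interface. The host, port, bucket and key names are my assumptions and not from the thread; the helpers only build and send the requests, and they do not try to predict what a given Riak version will return.

```python
"""Sketch: repeat the wiki's write-then-read test over Riak's HTTP interface."""
from urllib.error import HTTPError
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Assumptions (not from the thread): address of one running node.
HOST, PORT = "127.0.0.1", 8098


def riak_url(bucket, key, **params):
    """Build a /buckets/<bucket>/keys/<key> URL with optional quorum parameters."""
    path = "/buckets/%s/keys/%s" % (quote(bucket, safe=""), quote(key, safe=""))
    query = urlencode(sorted(params.items()))
    return "http://%s:%d%s%s" % (HOST, PORT, path, "?" + query if query else "")


def write_value(bucket, key, value, w=3):
    """Step 1: store a plain-text value, asking for w write acknowledgements."""
    req = Request(
        riak_url(bucket, key, w=w),
        data=value.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain"},
    )
    with urlopen(req) as resp:
        return resp.status


def read_value(bucket, key, r="all"):
    """Steps 3-4: read with the given r and return (HTTP status, body)."""
    try:
        with urlopen(riak_url(bucket, key, r=r)) as resp:
            return resp.status, resp.read().decode("utf-8")
    except HTTPError as err:
        # A missing object comes back as an HTTP error status, not an exception here.
        return err.code, err.read().decode("utf-8", "replace")
```

To use it, call write_value("test", "k1", "hello"), take down one primary for the key, and then call read_value("test", "k1") right away and again a minute or so later, noting the elapsed time for each read. That should answer Mike's question about the time between "node down" and the read.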