Hi Alexander - thanks again for your inputs here. I believe the problem was the sloppy quorum kicking in when I brought down one node. A fallback node would become active and, from what I understand of the documentation, since the fallback did not yet have the key and could respond faster than the primaries, my fetches randomly returned nothing until all the read repairs had completed.
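For anyone finding this thread later, here is a minimal sketch in Python of the back-off retry pattern Alexander suggested (fail once at the quorum read, then retry at r=1). The `fetch` callable is a hypothetical stand-in for a Riak client get such as the official client's `bucket.get(key, r=...)`; the r values, retry count, and delays are illustrative assumptions, not prescribed settings.

```python
import time

class NotFound(Exception):
    """Raised when a read returns no object (e.g. a fallback node answered first)."""

def get_with_backoff(fetch, key, retries=3, base_delay=0.05):
    """Try a quorum read first; on a miss, back off and retry at r=1.

    `fetch(key, r)` is a hypothetical client call that returns the value or
    raises NotFound. Retrying at r=1 accepts the first replica that answers,
    which gives read repair time to restore copies on surviving nodes.
    """
    try:
        return fetch(key, r=2)  # initial read at a quorum-like r value
    except NotFound:
        pass
    for attempt in range(retries):
        time.sleep(base_delay * (2 ** attempt))  # exponential back-off
        try:
            return fetch(key, r=1)  # relaxed read: first available copy wins
        except NotFound:
            continue
    raise NotFound(key)
```

The same shape works with any client library: the pattern is independent of the transport, only the `fetch` implementation changes.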
Understand this much better now - many thanks once again.

On Tue, May 24, 2016 at 4:24 PM, Alexander Sicular <sicul...@gmail.com> wrote:
> Hi Vikram,
>
> If you're using the defaults, two of the copies may be on the same
> machine. When using the default values (ring_size=64, n_val=3) you are
> not guaranteed copies on distinct physical machines. Implement a
> back-off retry design pattern: i.e., fail once, then try again with r=1.
> Also, a read will trigger a read repair operation, which will then copy
> your data n_val times to the surviving members of the cluster.
>
> Have you tried that?
> -Alexander
>
> Read these blog posts for more info:
>
> http://basho.com/posts/technical/understanding-riaks-configurable-behaviors-part-1/
> http://basho.com/posts/technical/riaks-config-behaviors-part-2/
> http://basho.com/posts/technical/riaks-config-behaviors-part-3/
> http://basho.com/posts/technical/riaks-config-behaviors-part-4/
>
> On Tue, May 24, 2016 at 3:08 PM, Vikram Lalit <vikramla...@gmail.com> wrote:
> > It's returning no object at all for the relevant key. That too is
> > random - every few calls it returns, but then it doesn't.
> >
> > On May 24, 2016 4:06 PM, "Sargun Dhillon" <sar...@sargun.me> wrote:
> >> What do you mean it's not returning? Is it returning stale data, or
> >> is it erroring?
> >>
> >> On Tue, May 24, 2016 at 7:34 AM, Vikram Lalit <vikramla...@gmail.com> wrote:
> >> > Hi - I'd appreciate it if someone could opine on the below behavior
> >> > of Riak that I am observing... is it expected, or is something wrong
> >> > in my set-up / understanding?
> >> >
> >> > To summarize, I have a 3-node Riak cluster (separate AWS EC2
> >> > instances) with a separate chat server connecting to them. When I
> >> > write data to the Riak nodes, the writes succeed and I can read all
> >> > the data back correctly. However, as part of my testing, if I
> >> > deliberately bring down one node (and then remove it from the
> >> > cluster using riak-admin cluster force-remove / plan / commit), the
> >> > client API is not able to fetch all the written data. In fact,
> >> > success and failure alternate rather randomly.
> >> >
> >> > My initial suspicion was that this would happen only while the
> >> > rebalancing was in progress (i.e. while riak-admin ring-status was
> >> > not fully settled), but I've seen the sporadic behavior after it
> >> > settled too.
> >> >
> >> > Does this have to do with the n and r values for the cluster -
> >> > given that 1 node is down, does the cluster fail to return results
> >> > reliably? Also, does this mean that while a cluster is being
> >> > rebalanced (even including the addition of new nodes), results can
> >> > be arbitrary? That doesn't sound right to me.
> >> >
> >> > I'd appreciate it if someone could throw some light here. Also, I
> >> > couldn't locate the HTTP API calls to retrieve and set the
> >> > n / r / w values for a specific bucket!
> >> >
> >> > Thanks much!
> >> > Vikram
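On the last question in the quoted thread: bucket properties such as n_val, r, and w are exposed over Riak's HTTP interface at /buckets/<bucket>/props. The host, port, bucket name, and property values below are placeholders; adjust to your cluster.

```shell
# Read the current properties (n_val, r, w, etc.) for a bucket -- returns JSON
curl http://127.0.0.1:8098/buckets/mybucket/props

# Set properties; only the keys you include in the body are changed
curl -XPUT http://127.0.0.1:8098/buckets/mybucket/props \
     -H 'Content-Type: application/json' \
     -d '{"props": {"n_val": 3, "r": 2}}'
```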
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com