Hi Vikram,

If you're using the defaults, two of the copies may end up on the same machine. With the default values (ring_size=64, n_val=3) you are not guaranteed copies on distinct physical machines. Implement a back-off retry pattern: if a read fails, try again with r=1. A successful read will also trigger a read-repair operation, which re-replicates your data n_val times across the surviving members of the cluster.
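A minimal sketch of the back-off retry pattern described above, assuming a hypothetical `fetch` callable whose `r` argument mirrors Riak's `r` read-quorum query parameter (the stub below stands in for a real client call; names and retry parameters are illustrative, not from Riak itself):

```python
import time

def fetch_with_retry(fetch, retries=3, backoff=0.1):
    """Back-off retry: first attempt uses the default read quorum;
    subsequent attempts relax the quorum to r=1."""
    last_err = None
    for attempt in range(retries):
        try:
            # r=None means "use the bucket's default quorum" in this sketch.
            return fetch(r=None if attempt == 0 else 1)
        except Exception as err:
            last_err = err
            time.sleep(backoff * (2 ** attempt))  # exponential back-off
    raise last_err

# Stub fetch that fails at the default quorum (as if a replica is down)
# but succeeds once the quorum is relaxed to r=1:
calls = []
def stub_fetch(r):
    calls.append(r)
    if r is None:
        raise IOError("quorum not met")
    return b"value"

result = fetch_with_retry(stub_fetch)
```

In a real client you would replace `stub_fetch` with your actual get call; the wrapper only encodes the fail-once-then-retry-with-r=1 idea.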
Have you tried that?

-Alexander

Read these blog posts for more info:

http://basho.com/posts/technical/understanding-riaks-configurable-behaviors-part-1/
http://basho.com/posts/technical/riaks-config-behaviors-part-2/
http://basho.com/posts/technical/riaks-config-behaviors-part-3/
http://basho.com/posts/technical/riaks-config-behaviors-part-4/

On Tue, May 24, 2016 at 3:08 PM, Vikram Lalit <vikramla...@gmail.com> wrote:
> It's returning no object at all for the relevant key. That too is random -
> every few calls it returns, but then it doesn't.
>
> On May 24, 2016 4:06 PM, "Sargun Dhillon" <sar...@sargun.me> wrote:
>>
>> What do you mean it's not returning? Is it returning stale data? Or is
>> it erroring?
>>
>> On Tue, May 24, 2016 at 7:34 AM, Vikram Lalit <vikramla...@gmail.com>
>> wrote:
>> > Hi - I'd appreciate it if someone could opine on the behavior of Riak
>> > I am observing below... is it expected, or is something wrong in my
>> > set-up / understanding?
>> >
>> > To summarize, I have a 3-node Riak cluster (separate AWS EC2 instances)
>> > with a separate chat server connecting to them. When I write data to
>> > the Riak nodes, the writes succeed and I can read all the data back
>> > correctly. However, as part of my testing, if I deliberately bring down
>> > one node (and then remove it from the cluster using riak-admin cluster
>> > force-remove / plan / commit), the client API is not able to fetch all
>> > the written data. In fact, it alternates between success and failure
>> > rather randomly.
>> >
>> > My initial suspicion was that this would happen only while the
>> > rebalancing was occurring (i.e. while riak-admin ring-status is not
>> > fully settled), but I've seen this sporadic behavior after that too.
>> >
>> > Does this have to do with the n and r values for the cluster - i.e.
>> > given that one node is down, does the cluster fail to return results
>> > reliably?
>> > Also, does this mean that while a cluster is being rebalanced (even
>> > incl. the addition of new nodes), the results could be arbitrary? That
>> > doesn't sound correct to me.
>> >
>> > I'd appreciate it if someone could throw some light here. Also, I
>> > couldn't locate the HTTP API calls to retrieve and set the n / r / w
>> > values for a specific bucket!
>> >
>> > Thanks much!
>> > Vikram
>> >
>> >
>> > _______________________________________________
>> > riak-users mailing list
>> > riak-users@lists.basho.com
>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>> >
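On the last question in the thread: Riak exposes per-bucket properties over HTTP at /buckets/<bucket>/props - a GET returns the current properties, and a PUT with a JSON body wrapped in a "props" object sets them. A minimal sketch of the URL and payload, using a hypothetical node address and bucket name (send them with any HTTP client, with Content-Type: application/json on the PUT):

```python
import json

# Hypothetical node address and bucket name - adjust for your cluster.
RIAK_HOST = "http://127.0.0.1:8098"
BUCKET = "chat_messages"

# Riak's HTTP API serves bucket properties at /buckets/<bucket>/props.
props_url = f"{RIAK_HOST}/buckets/{BUCKET}/props"

# GET props_url returns JSON like {"props": {"n_val": 3, ...}}.
# To change properties, PUT a JSON body wrapped in a "props" object:
new_props = json.dumps({"props": {"n_val": 3, "r": 1, "w": 2}})

print(props_url)
print(new_props)
```

The n_val/r/w values shown are illustrative; pick quorum settings that match your durability and availability needs.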