On 1 Oct 2015, at 15:45, Zuzana Zatrochova <zatroch...@gmail.com> wrote:
> Thank you for fast reply,
>
> Could you please specify what do you mean by context sent by client?

When a client reads a value it gets an opaque context too; it must return this to riak when it performs an update. In the absence of a context an empty context is assumed.

> Do you mean update on the existing object in database?

Yes.

> I see exactly that when allow_mult=false, only the highest timestamp value is
> stored.
>
> For me the results are unexpected because the client sees inconsistent values
> (not from the last write) but there are no partitions and quorum is set to
> the strongest consistency configurations. In the diagram, it is shown more
> clearly how shifted clocks generate inconsistent result.

Inconsistent as in non-deterministic, or just not what you expected?

> Thanks,
> Zuzana
>
> On 1 October 2015 at 14:18, Russell Brown <russell.br...@me.com> wrote:
> I need more time to examine the diagram, but this all looks as expected so
> far.
>
> If a client sends no context then its write will be a sibling of whatever is
> stored at the coordinator; as you rightly point out, riak treats an incoming
> clock that is less than a local clock as a sibling.
> If the coordinator is configured to not store siblings then the sibling value
> with the highest timestamp is stored. I recommend you run riak with either
> allow_mult=true or LWW=true; allow_mult=false, in my view, should not be
> the default.
> If two riak nodes do the above, and then replicate their values, the single
> value with the highest timestamp is stored. Isn't this what you are seeing? If
> you depend on time to pick the latest, and nodes' clocks are out of sync, this
> is the price.
>
> Is this what you are seeing? Are you seeing results you didn't expect, or
> non-deterministic results? Or both?
>
> Regards
>
> Russell
>
> On 1 Oct 2015, at 12:58, Zuzana Zatrochova <zatroch...@gmail.com> wrote:
>
> > Hi,
> >
> > We are researching the client-centric consistency features of Riak
> > database. We encountered a problem with vector clocks implementation. The
> > vector clocks do not seem to work locally on a machine as expected. We
> > would like you to confirm if the behavior is desired. First I will describe
> > the environment of our experiments and then the problem will be presented.
> >
> > Environment:
> >
> > • Our environment consists of six virtual machines
> >     • five machines in Riak cluster, each representing a single Riak
> >       node with Riak database
> >     • one machine with java application that simulates multiple
> >       clients communicating with Riak database
> > • Machines are VMs virtualized by VMware software and have slightly
> >   shifted time relative to each other (no more than 1 second)
> > • We made experiments with versions riak-1.4.8 and riak-2.1.1. In
> >   riak-1.4.8 app.config contains vnode_vclocks = true (default setting that
> >   was there when downloaded). In riak-2.1.1 we could not locate the
> >   configuration for vnode vclocks either in advanced configurations in the
> >   documentation or in riak.conf, so we assumed it also defaults to true and
> >   can no longer be changed
> > • For each experiment we have 500 clients concurrently sending
> >   requests to a random node from the cluster. There are 20000 requests per
> >   minute operating only on 20 different keys (load on a single key is about
> >   16 requests per second, read:write ratio = 50:50).
> > • For the referenced issue we used quorums R = 1, W = 3; R = 2, W = 2 and
> >   R = 3, W = 1
> > • All riak settings are default apart from IP settings and quorum
> >   settings. We added interceptors from riak_test module that don't change
> >   the code and are implemented only for logging purposes (information about
> >   states of nodes); error.log is empty
> >
> > Problem:
> >
> > • It seems that Riak does not use vector clocks locally, only on
> >   global scale. When a data object is created on client side and sent to
> >   Riak database it does not have any vector clocks assigned (more
> >   precisely, riak_object:vclock(UpdObj) returns [], while
> >   riak_object:vclock(LocalObj) returns the local VC for the local object).
> >   Therefore the function (in 2.1.1 but similar behavior is in 1.4.8)
> >   vclock:descends(NewObject, LocalObject) returns false for all my
> >   experiments with different quorums (empty vector clocks cannot descend
> >   non-empty vector clocks). The behavior leads to a merge of contents, i.e.
> >   creation of siblings (or resolving the value according to the timestamp,
> >   not vector clocks, when siblings are not allowed, which is our
> >   configuration)
> > • In our experiments, when time on the VMs is out of sync by up to 500
> >   milliseconds, the situation from picture issue.png sent in attachment
> >   arises. Because two objects with the same key are sent to two different
> >   coordinators and the coordinators' clocks are shifted, the later object
> >   is assigned an earlier timestamp than the object that was sent before it.
> >   As the result of the vector clocks implementation in Riak, the later
> >   object is lost in the merge of contents, where the later timestamp (wrong
> >   because of local clock shift) is evaluated as the latest.
> >
> > The question:
> >
> > Is this the Riak intended behavior? The problem is that even when quorum is
> > set to prefer consistency and there are no partitions in the cluster, there
> > are still requests that look inconsistent from the client's perspective,
> > where consistent means that any read must return the value of the latest
> > finished write or of a later unfinished write.
> > (We did not use the strong_consistency feature of riak-2.1.1 version).
> >
> > Thank you,
> >
> > Zuzana
> >
> > <issue.png>
> > _______________________________________________
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
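
For anyone following along, here is a minimal sketch of the behaviour discussed in this thread: a write that arrives without a context cannot descend the stored clock, so it becomes a sibling, and with siblings disallowed the value with the highest timestamp is kept. This is not Riak's code. The names Obj, descends and coordinate_put are hypothetical, the clock handling is heavily simplified, and the two coordinators are applied one after the other rather than replicating to each other.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    """Hypothetical stand-in for a stored object (not Riak's riak_object)."""
    value: str
    timestamp: float                             # coordinator's local wall clock
    vclock: dict = field(default_factory=dict)   # actor -> counter; {} = no context


def descends(a: dict, b: dict) -> bool:
    """True if clock a has seen every event recorded in clock b."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def coordinate_put(local: Obj, incoming: Obj, node: str, allow_mult: bool = False):
    """Simplified coordinator: return the object(s) kept after the write."""
    # Merge both clocks and record this coordinator's event (an assumption
    # made to keep the sketch small; real clock handling is more involved).
    actors = local.vclock.keys() | incoming.vclock.keys()
    merged = {a: max(local.vclock.get(a, 0), incoming.vclock.get(a, 0)) for a in actors}
    merged[node] = merged.get(node, 0) + 1

    if descends(incoming.vclock, local.vclock):
        # The client returned a context covering the stored value: plain overwrite.
        return [Obj(incoming.value, incoming.timestamp, merged)]

    # Otherwise the write is concurrent with the stored value: siblings.
    siblings = [local, incoming]
    if not allow_mult:
        # Siblings disallowed: keep only the highest timestamp.
        siblings = [max(siblings, key=lambda o: o.timestamp)]
    return [Obj(s.value, s.timestamp, merged) for s in siblings]


def demo(with_context: bool, allow_mult: bool = False):
    """w1 is sent first via node 'a' (clock 0.5 s fast); w2 is sent 0.2 s later via node 'b'."""
    stored = Obj("v0", 0.0, {"a": 1})
    (stored,) = coordinate_put(stored, Obj("w1", 10.0 + 0.5), "a")
    ctx = dict(stored.vclock) if with_context else {}
    kept = coordinate_put(stored, Obj("w2", 10.2, ctx), "b", allow_mult)
    return [o.value for o in kept]


if __name__ == "__main__":
    print(demo(with_context=False))                   # later write w2 is dropped
    print(demo(with_context=False, allow_mult=True))  # both kept as siblings
    print(demo(with_context=True))                    # w2 overwrites w1
```

In this sketch the later write only survives when the client first reads and sends the context back, or when siblings are kept and resolved by the application. That matches the point above that relying on timestamps is only as good as the clock synchronisation between nodes.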