Dear John and Russell Brown,

* How fast is your turnaround time between an update and a fetch?
The turnaround time between an update and a fetch is about 1 second. While my team and I were debugging, we adjusted haproxy and tested the following scenarios:

Scenario 1: round robin across the 5 nodes of the cluster.
We hit the issue in scenario 1, and we were afraid that a timeout between nodes might be leaving us with stale data, so we moved on to scenario 2.

Scenario 2: round robin disabled, every request routed to node 1 only; the cluster is still 5 nodes.
This way we are sure that both the update and the fetch always go to and come from node 1, and the issue still occurs.

At the time of the failure I hoped to get an error log from the Riak nodes with more information, but the Riak logs show nothing and everything looks ok.
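One thing we have not tried yet is re-reading with stricter read options after an update, to rule out the read being served by a replica that has not yet seen the write. Below is a minimal sketch only, assuming the usual riakc_pb_socket per-request options (r, pr, notfound_ok, timeout) are accepted by fetch_type; the helper function name is just for illustration, and the bucket type / bucket are the ones from this thread:

%% Sketch only: re-read the map with stricter read options, so that a
%% single lagging replica cannot satisfy the read on its own.
fetch_with_strict_read(Pid, Key) ->
    riakc_pb_socket:fetch_type(Pid, {<<"dev_restor">>, <<"menu">>}, Key,
                               [{r, quorum}, {pr, quorum},
                                {notfound_ok, false}, {timeout, 5000}]).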
* What operation are you performing?

I used:

riakc_pb_socket:update_type(Pid, {BucketType, Bucket}, Key, riakc_map:to_op(Map), []).
riakc_pb_socket:fetch_type(Pid, {BucketType, Bucket}, Key, []).
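For completeness, our full round trip looks like the sketch below, using the status_id register from our data as the example field and assuming the standard riakc_map / riakc_register helpers from the 2.x client; the helper function name is only for illustration. The opaque context returned by the fetch is carried inside the Map term and is included by riakc_map:to_op/1:

%% Minimal sketch of our fetch -> modify -> update sequence.
set_status(Pid, BucketType, Bucket, Key, NewStatus) ->
    %% Fetch the map; its opaque causal context comes back inside Map0.
    {ok, Map0} = riakc_pb_socket:fetch_type(Pid, {BucketType, Bucket}, Key, []),
    %% Change one last-write-wins register, e.g. status_id to <<"show">>.
    Map1 = riakc_map:update({<<"status_id">>, register},
                            fun(R) -> riakc_register:set(NewStatus, R) end,
                            Map0),
    %% Send the pending operation (together with the fetched context) back to Riak.
    ok = riakc_pb_socket:update_type(Pid, {BucketType, Bucket}, Key,
                                     riakc_map:to_op(Map1), []).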
* It looks like the map is a single level map of last-write-wins registers. Is there a chance that the time on the node handling the update is behind the value in the lww-register?

=> I am not sure about the conflict-resolution logic inside the Riak nodes, and the issue never happens when I use a single node. My bucket properties are as follows:

{"props":{"name":"menu","active":true,"allow_mult":true,"backend":"bitcask_mult","basic_quorum":false,"big_vclock":50,"chash_keyfun":{"mod":"riak_core_util","fun":"chash_std_keyfun"},"claimant":"riak-node1@64.137.190.244","datatype":"map","dvv_enabled":true,"dw":"quorum","last_write_wins":false,"linkfun":{"mod":"riak_kv_wm_link_walker","fun":"mapreduce_linkfun"},"n_val":3,"name":"menu","notfound_ok":true,"old_vclock":86400,"postcommit":[],"pr":0,"precommit":[],"pw":0,"r":"quorum","rw":"quorum","search_index":"menu_idx","small_vclock":50,"w":"quorum","young_vclock":20}}

Note:
+ "datatype": "map"
+ "last_write_wins": false
+ "dvv_enabled": true
+ "allow_mult": true

* Have you tried using the `modify_type` operation in riakc_pb_socket which does the fetch/update operation in sequence for you?

=> I do not use it yet; my flow is the same sequence of fetch and then update. I may try modify_type and see.
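If I try it, I expect it to look roughly like the sketch below; this assumes the modify_type/5 signature from the 2.x riak-erlang-client (Pid, update fun, {BucketType, Bucket}, Key, Options), which performs the fetch, applies the fun, and sends the resulting operation back in one call. The helper function name is only for illustration:

%% Sketch only: the same status_id change as above, but letting the
%% client do the fetch/update sequence for us.
set_status_with_modify(Pid, BucketType, Bucket, Key, NewStatus) ->
    riakc_pb_socket:modify_type(
        Pid,
        fun(Map) ->
                riakc_map:update({<<"status_id">>, register},
                                 fun(R) -> riakc_register:set(NewStatus, R) end,
                                 Map)
        end,
        {BucketType, Bucket}, Key, []).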
* Anything in the error logs on any of the nodes?

=> From the node logs, there is no error report at the time of the failure.

* Is the opaque context identical from the fetch and then later after the update?

=> The context obtained from the fetch is the context used with the update. And during our debugging, with a sequence of fetch, update, fetch, update, ..., the context I saw in the fetched data was always the same.

Best regards,
Hue Tran

On Tue, Feb 7, 2017 at 2:11 AM, John Daily <jda...@basho.com> wrote:

> Originally I suspected the context which allows Riak to resolve conflicts
> was not present in your data, but I see it in your map structure. Thanks
> for supplying such a detailed description.
>
> How fast is your turnaround time between an update and a fetch? Even if
> the cluster is healthy it's not impossible to see a timeout between nodes,
> which could result in a stale retrieval. Have you verified whether the
> stale data persists?
>
> A single node cluster gives an advantage that you'll never see in a real
> cluster: a perfectly synchronized clock. It also reduces (but does not
> completely eliminate) the possibility of an internal timeout between
> processes.
>
> -John
>
> On Feb 6, 2017, at 1:02 PM, my hue <tranmyhue.grac...@gmail.com> wrote:
>
> Dear Riak Team,
>
> My team and I use Riak as the database for our production system, with a cluster of 5 nodes.
> While production was running we hit a critical bug: sometimes an update to a document fails.
> My colleagues and I debugged it and found the issue with the following scenario:
>
> + fetch the document
> + change a value in the document
> + update the document
>
> Repeat this about 10 times and it will fail. When a document is updated continually,
> the update will sometimes fail.
>
> At first the 5 nodes of the cluster ran Riak version 2.1.1.
> After hitting this bug we upgraded to Riak version 2.2.0, and the issue still occurs.
>
> Over many rounds of testing and debugging, using tcpdump on a Riak node:
>
> tcpdump -A -ttt -i {interface} src host {host} and dst port {port}
>
> together with the command:
>
> riak-admin status | grep "node_puts_map\| node_puts_map_total\| node_puts_total\| vnode_map_update_total\| vnode_puts_total"
>
> we confirmed that the Riak server does receive the update request.
> However, we do not know why the Riak backend fails to update the document.
> At the time of the failure, the Riak server logs show that everything is ok.
>
> Then we removed the cluster, used a single Riak server, and saw that the bug above never happens.
>
> For that reason we think it only happens when running as a cluster. We researched the Basho Riak
> documentation and our Riak configuration seems to follow its suggestions. We are completely blocked
> on this issue and hope to get support from you, so that we can obtain stable behaviour from the
> Riak database for our production system.
> Thank you so much; we hope to get your reply soon.
>
>
> * The following is our Riak node information:
>
> Riak version: 2.2.0
> OS: CentOS Linux release 7.2.1511
> CPU: 4 cores
> Memory: 4G
> Riak configuration: the attached file "riak.conf"
>
> Note:
>
> - We mostly use the default Riak configuration, except that the storage backend is multi:
>
> storage_backend = multi
> multi_backend.bitcask_mult.storage_backend = bitcask
> multi_backend.bitcask_mult.bitcask.data_root = /var/lib/riak/bitcask_mult
> multi_backend.default = bitcask_mult
>
> ------------------------------------------------------------------------
>
> - Bucket type created with the following commands:
>
> riak-admin bucket-type create dev_restor '{"props":{"backend":"bitcask_mult","datatype":"map"}}'
> riak-admin bucket-type activate dev_restor
>
> ------------------------------------------------------------------------
>
> - Bucket type status:
>
> >> riak-admin bucket-type status dev_restor
>
> dev_restor is active
> young_vclock: 20
> w: quorum
> small_vclock: 50
> rw: quorum
> r: quorum
> pw: 0
> precommit: []
> pr: 0
> postcommit: []
> old_vclock: 86400
> notfound_ok: true
> n_val: 3
> linkfun: {modfun,riak_kv_wm_link_walker,mapreduce_linkfun}
> last_write_wins: false
> dw: quorum
> dvv_enabled: true
> chash_keyfun: {riak_core_util,chash_std_keyfun}
> big_vclock: 50
> basic_quorum: false
> backend: <<"bitcask_mult">>
> allow_mult: true
> datatype: map
> active: true
> claimant: 'riak-node1@64.137.190.244'
>
> ------------------------------------------------------------------------
>
> - Bucket property:
>
> {"props":{"name":"menu","active":true,"allow_mult":true,"backend":"bitcask_mult","basic_quorum":false,"big_vclock":50,"chash_keyfun":{"mod":"riak_core_util","fun":"chash_std_keyfun"},"claimant":"riak-node1@64.137.190.244","datatype":"map","dvv_enabled":true,"dw":"quorum","last_write_wins":false,"linkfun":{"mod":"riak_kv_wm_link_walker","fun":"mapreduce_linkfun"},"n_val":3,"name":"menu","notfound_ok":true,"old_vclock":86400,"postcommit":[],"pr":0,"precommit":[],"pw":0,"r":"quorum","rw":"quorum","search_index":"menu_idx","small_vclock":50,"w":"quorum","young_vclock":20}}
>
> ------------------------------------------------------------------------
>
> - Member status:
>
> >> riak-admin member-status
>
> ================================= Membership ==================================
> Status     Ring    Pending    Node
> -------------------------------------------------------------------------------
> valid      18.8%      --      'riak-node1@64.137.190.244'
> valid      18.8%      --      'riak-node2@64.137.247.82'
> valid      18.8%      --      'riak-node3@64.137.162.64'
> valid      25.0%      --      'riak-node4@64.137.161.229'
> valid      18.8%      --      'riak-node5@64.137.217.73'
> -------------------------------------------------------------------------------
> Valid:5 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
>
> ------------------------------------------------------------------------
>
> - Ring:
>
> >> riak-admin status | grep ring
>
> ring_creation_size : 64
> ring_members : ['riak-node1@64.137.190.244','riak-node2@64.137.247.82','riak-node3@64.137.162.64','riak-node4@64.137.161.229','riak-node5@64.137.217.73']
> ring_num_partitions : 64
> ring_ownership : <<"[{'riak-node2@64.137.247.82',12},\n {'riak-node5@64.137.217.73',12},\n {'riak-node1@64.137.190.244',12},\n {'riak-node3@64.137.162.64',12},\n {'riak-node4@64.137.161.229',16}]">>
> rings_reconciled : 0
> rings_reconciled_total : 31
>
> ------------------------------------------------------------------------
>
> * The Riak client:
>
> + riak-erlang-client: https://github.com/basho/riak-erlang-client
> + release: 2.4.2
>
> ------------------------------------------------------------------------
>
> * Riak client API used:
>
> + Insert/Update:
>
> riakc_pb_socket:update_type(Pid, {BucketType, Bucket}, Key, riakc_map:to_op(Map), []).
>
> + Fetch:
>
> riakc_pb_socket:fetch_type(Pid, {BucketType, Bucket}, Key, []).
>
> ------------------------------------------------------------------------
>
> * Steps to perform an update:
>
> - Fetch the document
> - Update the document
>
> ------------------------------------------------------------------------
>
> * Data returned from fetch_type:
>
> {map,[{{<<"account_id">>,register},<<"accounta25a424b8484181e8ba1bec25bf7c491">>},
>       {{<<"created_by_id">>,register},<<"accounta25a424b8484181e8ba1bec25bf7c491">>},
>       {{<<"created_time_dt">>,register},<<"2017-01-27T03:34:04Z">>},
>       {{<<"currency">>,register},<<"cad">>},
>       {{<<"end_time">>,register},<<"dont_use">>},
>       {{<<"id">>,register},<<"menufe89488afa948875cab6b0b18d579f21">>},
>       {{<<"maintain_mode_b">>,register},<<"false">>},
>       {{<<"menu_category_revision_id">>,register},<<"0-634736bc14e0bd3ed7e3fe0f1ee64443">>},
>       {{<<"name">>,register},<<"fullmenu">>},
>       {{<<"order_i">>,register},<<"0">>},
>       {{<<"rest_location_p">>,register},<<"10.844117421366443,106.63982392275398">>},
>       {{<<"restaurant_id">>,register},<<"rest848e042b3a0488640981c8a6dc4a8281">>},
>       {{<<"restaurant_status_id">>,register},<<"inactive">>},
>       {{<<"start_time">>,register},<<"dont_use">>},
>       {{<<"status_id">>,register},<<"hide">>},
>       {{<<"updated_by_id">>,register},<<"accounta25a424b8484181e8ba1bec25bf7c491">>},
>       {{<<"updated_time_dt">>,register},<<"2017-02-06T17:22:39Z">>}],
>      [],
>      [],
>      <<131,108,0,0,0,3,104,2,109,0,0,0,12,39,21,84,209,219,42,57,233,0,0,156,252,97,34,
>        104,2,109,0,0,0,12,132,107,248,226,103,5,182,208,0,0,118,2,97,40,
>        104,2,109,0,0,0,12,137,252,139,186,176,202,25,96,0,0,195,164,97,54,106>>}
>
> * Update with update_type
>
> Below is the map data before calling riakc_map:to_op(Map):
>
> {map,[],
>      [{{<<"account_id">>,register},{register,<<>>,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}},
>       {{<<"created_by_id">>,register},{register,<<>>,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}},
>       {{<<"created_time_dt">>,register},{register,<<>>,<<"2017-01-27T03:34:04Z">>}},
>       {{<<"currency">>,register},{register,<<>>,<<"cad">>}},
>       {{<<"end_time">>,register},{register,<<>>,<<"dont_use">>}},
>       {{<<"id">>,register},{register,<<>>,<<"menufe89488afa948875cab6b0b18d579f21">>}},
>       {{<<"maintain_mode_b">>,register},{register,<<>>,<<"false">>}},
>       {{<<"menu_category_revision_id">>,register},{register,<<>>,<<"0-634736bc14e0bd3ed7e3fe0f1ee64443">>}},
>       {{<<"name">>,register},{register,<<>>,<<"fullmenu">>}},
>       {{<<"order_i">>,register},{register,<<>>,<<"0">>}},
>       {{<<"rest_location_p">>,register},{register,<<>>,<<"10.844117421366443,106.63982392275398">>}},
>       {{<<"restaurant_id">>,register},{register,<<>>,<<"rest848e042b3a0488640981c8a6dc4a8281">>}},
>       {{<<"restaurant_status_id">>,register},{register,<<>>,<<"inactive">>}},
>       {{<<"start_time">>,register},{register,<<>>,<<"dont_use">>}},
>       {{<<"status_id">>,register},{register,<<>>,<<"show">>}},
>       {{<<"updated_by_id">>,register},{register,<<>>,<<"accounta25a424b8484181e8ba1bec25bf7c491">>}},
>       {{<<"updated_time_dt">>,register},{register,<<>>,<<"2017-02-06T17:22:39Z">>}}],
>      [],
>      <<131,108,0,0,0,3,104,2,109,0,0,0,12,39,21,84,209,219,42,57,233,0,0,156,252,97,34,
>        104,2,109,0,0,0,12,132,107,248,226,103,5,182,208,0,0,118,2,97,39,
>        104,2,109,0,0,0,12,137,252,139,186,176,202,25,96,0,0,195,164,97,53,106>>}
>
> -
>
> Best regards,
> Hue Tran
>
> <riak.conf>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com