Also, this issue https://github.com/basho/riak_kv/issues/1188 suggests that 
adding the property `riak_kv.retry_put_coordinator_failure=false` may help in 
the future, but it won’t help with the keys that already have too many siblings.
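
For what it is worth, here is a rough illustration of the read-put cycle mentioned in the quoted reply below. It is only a toy, in-memory, single-node model and not Riak's implementation: `ToyKey`, `read_put_cycle` and the integer "context" are hypothetical names invented for this sketch, with the integer standing in for the vector clock that a real read would return. It shows why writes that carry no causal context pile up as siblings, and why one write that carries the context from a read collapses them.

```python
from typing import Callable, List, Tuple


class ToyKey:
    """Toy single-node model of one key; NOT Riak's implementation.

    Each stored sibling carries the write counter ("dot") it was created
    with. An integer context stands in for Riak's vector clock.
    """

    def __init__(self) -> None:
        self.counter = 0
        self.siblings: List[Tuple[str, int]] = []

    def get(self) -> Tuple[List[str], int]:
        """Return all sibling values plus a context covering every write so far."""
        return [value for value, _ in self.siblings], self.counter

    def put(self, value: str, context: int = 0) -> None:
        """Store value; siblings the writer had already seen are replaced."""
        self.counter += 1
        unseen = [(v, dot) for v, dot in self.siblings if dot > context]
        self.siblings = unseen + [(value, self.counter)]


def read_put_cycle(key: ToyKey, resolve: Callable[[List[str]], str]) -> None:
    """Read every sibling, pick one value, write it back with the read context."""
    values, context = key.get()
    if len(values) > 1:
        key.put(resolve(values), context)


key = ToyKey()
for i in range(5):
    key.put(f"manifest-v{i}")      # blind writes: no context supplied
print(len(key.siblings))           # 5

read_put_cycle(key, resolve=max)   # hypothetical resolver: keep the largest string
print(key.siblings)                # [('manifest-v4', 6)]
```

On a real cluster the context would be the opaque vector clock returned with the read, and the `resolve` step for Riak CS manifests would have to follow Riak CS's own merge rules rather than a generic function like `max`, so treat this purely as a picture of the mechanism.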

On 24 May 2017, at 09:22, Russell Brown <russell.br...@icloud.com> wrote:

> 
> On 24 May 2017, at 09:11, Vladyslav Zakhozhai <v.zakhoz...@smartweb.com.ua> 
> wrote:
> 
>> Hello,
>> 
>> My riak cluster still experiences "too many siblings" errors, and hinted 
>> handoffs never finish completely. So "siblings will be resolved after 
>> hinted handoffs are finished" is unfortunately not my case.
>> 
>> According to basho's docs 
>> (http://docs.basho.com/riak/kv/2.2.3/learn/concepts/causal-context/#sibling-explosion)
>>  I need to enable the DVV conflict resolution mechanism. So here is a question:
>> 
>> Is it safe to enable DVV on the default bucket type, and how does it affect 
>> existing data?
> 
> It might not affect existing data enough. All the existing siblings are 
> “undotted” and would need a read-put cycle to resolve.
> 
>> Could that be a solution?
> 
> You may require further action. I remember basho support helping someone with 
> a similar issue, and there was some manual intervention/scripted solution, 
> but I can’t remember what it was right now. I think those objects (as logged) 
> with the sibling issues need to be read and resolved. Maybe one of the 
> ex-basho support people remembers? I’ll prod one in a back channel and see if 
> they can help.
> 
>> 
>> Why am I talking about the default bucket type? Because there is only one 
>> riak client, Riak CS, and it does not manage the bucket types of the objects 
>> it PUTs (so the default bucket type is always used during PUTs). Is that 
>> correct?
> 
> Yes.
> 
>> 
>> Thank you in advance.
>> 
>> On Fri, Jun 17, 2016 at 11:45 AM Vladyslav Zakhozhai 
>> <v.zakhoz...@smartweb.com.ua> wrote:
>> Hi Russell,
>> 
>> thank you for your answer. I really appreciate your help.
>> 
>> 2.1.3 is not actually the riak_kv version. It is the version of basho's riak 
>> package. You can see the versions of the riak subsystems below.
>> 
>> Bucket properties:
>> # riak-admin bucket-type list
>> default (active)
>> 
>> # riak-admin bucket-type status default
>> default is active
>> 
>> allow_mult: true
>> basic_quorum: false
>> big_vclock: 50
>> chash_keyfun: {riak_core_util,chash_std_keyfun}
>> dvv_enabled: false
>> dw: quorum
>> last_write_wins: false
>> linkfun: {modfun,riak_kv_wm_link_walker,mapreduce_linkfun}
>> n_val: 3
>> notfound_ok: true
>> old_vclock: 86400
>> postcommit: []
>> pr: 0
>> precommit: []
>> pw: 0
>> r: quorum
>> rw: quorum
>> small_vclock: 50
>> w: quorum
>> write_once: false
>> young_vclock: 20
>> 
>> I did not mention that an upgrade from riak 1.5.4 took place some months ago 
>> (about 6 months). As I understand it, DVV is disabled. Is it safe to 
>> migrate from vector clocks to DVV?
>> 
>> Package versions:
>> # dpkg -l | grep riak
>> ii  riak       2.1.3-1   amd64   Riak is a distributed data store
>> ii  riak-cs    2.1.0-1   amd64   Riak CS
>> 
>> Subsystems versions:
>> "clique_version" : "0.3.2-0-ge332c8f",
>> "bitcask_version" : "1.7.2",
>> "sys_driver_version" : "2.2",
>> "riak_core_version" : "2.1.5-0-gb02ab53",
>> "riak_kv_version" : "2.1.2-0-gf969bba",
>> "riak_pipe_version" : "2.1.1-0-gb1ac2cf",
>> "cluster_info_version" : "2.0.3-0-g76c73fc",
>> "riak_auth_mods_version" : "2.1.0-0-g31b8b30",
>> "erlydtl_version" : "0.7.0",
>> "os_mon_version" : "2.2.13",
>> "inets_version" : "5.9.6",
>> "erlang_js_version" : "1.3.0-0-g07467d8",
>> "riak_control_version" : "2.1.2-0-gab3f924",
>> "xmerl_version" : "1.3.4",
>> "protobuffs_version" : "0.8.1p5-0-gf88fc3c",
>> "riak_sysmon_version" : "2.0.0",
>> "compiler_version" : "4.9.3",
>> "eleveldb_version" : "2.1.10-0-g0537ca9",
>> "lager_version" : "2.1.1",
>> "sasl_version" : "2.3.3",
>> "riak_dt_version" : "2.1.1-0-ga2986bc",
>> "runtime_tools_version" : "1.8.12",
>> "yokozuna_version" : "2.1.2-0-g3520d11",
>> "riak_search_version" : "2.1.1-0-gffe2113",
>> "sys_system_version" : "Erlang R16B02_basho8 (erts-5.10.3) [source] [64-bit] [smp:4:4] [async-threads:64] [kernel-poll:true] [frame-pointer]",
>> "basho_stats_version" : "1.0.3",
>> "crypto_version" : "3.1",
>> "merge_index_version" : "2.0.1-0-g0c8f77c",
>> "kernel_version" : "2.16.3",
>> "stdlib_version" : "1.19.3",
>> "riak_pb_version" : "2.1.0.2-0-g620bc70",
>> "syntax_tools_version" : "1.6.11",
>> "goldrush_version" : "0.1.7",
>> "ibrowse_version" : "4.0.2",
>> "mochiweb_version" : "2.9.0",
>> "exometer_core_version" : "1.0.0-basho2-0-gb47a5d6",
>> "ssl_version" : "5.3.1",
>> "public_key_version" : "0.20",
>> "pbkdf2_version" : "2.0.0-0-g7076584",
>> "sidejob_version" : "2.0.0-0-gc5aabba",
>> "webmachine_version" : "1.10.8-0-g7677c24",
>> "poolboy_version" : "0.8.1p3-0-g8bb45fb",
>> "riak_api_version" : "2.1.2-0-gd8d510f",
>> "asn1_version" : "2.0.3",
>> 
>> 
>> On Fri, Jun 17, 2016 at 10:45 AM Russell Brown <russell.br...@me.com> wrote:
>> What version of riak_kv is behind this riak_cs install, please? Is it really 
>> 2.1.3 as stated below? This looks and sounds like sibling explosion, which 
>> is fixed in riak 2.0 and above. Are you sure that you are using the DVV 
>> enabled setting for riak_cs bucket properties? Can you post your bucket 
>> properties please?
>> 
>> On 16 Jun 2016, at 23:54, Vladyslav Zakhozhai <v.zakhoz...@smartweb.com.ua> 
>> wrote:
>> 
>>> Hello.
>>> 
>>> I see a very interesting and confusing thing.
>>> 
>>> From my previous letter you can see that the sibling count on manifest 
>>> objects is about 100 (actually it is in the range 100-300). Unfortunately 
>>> my problem is that almost all PUT requests are failing with a 500 Internal 
>>> Server Error.
>>> 
>>> Today I tried setting the max_siblings riak option to 500. There were 
>>> successful PUT requests, but not for long. Now I see "max siblings" errors 
>>> in the riak logs again, but the actual sibling count is 500+ (earlier it 
>>> was 100-300, as I've mentioned).
>>> 
>>> The time between setting max_siblings=500 and the errors appearing in the 
>>> log was about 30 minutes. I also want to draw your attention to the fact 
>>> that I have blocked the PUT method on haproxy, the frontend for riak cs.
>>> 
>>> 
>>> 
>>> On Mon, Jun 6, 2016 at 1:17 AM Vladyslav Zakhozhai 
>>> <v.zakhoz...@smartweb.com.ua> wrote:
>>> Hi, Luke.
>>> 
>>> Thank you for your answer. I did not fully understand your point about 
>>> transfer-limit. How does it relate to my problem? The transfer limit is a 
>>> limit on concurrent data transfers from different nodes. Am I right? Do you 
>>> mean that riak can hand off one partition from several nodes concurrently?
>>> 
>>> Now I have transfer-limit 1 on all riak nodes.
>>> 
>>> But I am not sure that my cluster will ever converge. All nodes are low on 
>>> memory and are periodically killed by the OOM killer. I am trying to add 
>>> new nodes to the cluster, but because of the OOM killer problem this 
>>> process is very, very slow.
>>> 
>>> In the official docs I've read:
>>> 
>>> "Sibling explosion occurs when an object rapidly collects siblings that are 
>>> not reconciled. This can lead to a variety of problems, including degraded 
>>> performance, especially if many objects in a cluster suffer from siblings 
>>> explosion. At the extreme, having an enormous object in a node can cause 
>>> reads of that object to crash the entire node. Other issues include undue 
>>> latency and out-of-memory errors."
>>> 
>>> I have noticed that new nodes in the cluster do not experience such 
>>> problems (I mean running out of RAM).
>>> 
>>> Regarding siblings, maybe you are right and this is a manifest object. I 
>>> can recognize the key name but not the bucket name. But more than 100 
>>> siblings on many keys really confuses me. Each time I try to PUT an object 
>>> to Riak via the Riak CS S3 interface I get sibling errors.
>>> 
>>> On Fri, Jun 3, 2016 at 6:43 PM Luke Bakken <lbak...@basho.com> wrote:
>>> Hi Vladyslav,
>>> 
>>> If you recognize the full name of the object raising the sibling
>>> warning, it is most likely a manifest object. Sometimes, during hinted
>>> handoff, you can see these messages. They should resolve after handoff
>>> completes.
>>> 
>>> Please see the documentation for the transfer-limit command as well:
>>> 
>>> http://docs.basho.com/riak/kv/2.1.4/using/admin/riak-admin/#transfer-limit
>>> 
>>> --
>>> Luke Bakken
>>> Engineer
>>> lbak...@basho.com
>>> 
>>> 
>>> On Fri, Jun 3, 2016 at 2:55 AM, Vladyslav Zakhozhai
>>> <v.zakhoz...@smartweb.com.ua> wrote:
>>>> Hi.
>>>> 
>>>> I have trouble with PUTs to a Riak CS cluster. During this process I
>>>> periodically see the following message in the Riak error.log:
>>>> 
>>>> 2016-06-03 11:15:55.201 [error]
>>>> <0.15536.142>@riak_kv_vnode:encode_and_put:2253 Put failure: too many
>>>> siblings for object OBJECT_NAME (101)
>>>> 
>>>> and also
>>>> 
>>>> 2016-06-03 12:41:50.678 [error]
>>>> <0.20448.515>@riak_api_pb_server:handle_info:331 Unrecognized message
>>>> {7345880,{error,{too_many_siblings,101}}}
>>>> 
>>>> Here OBJECT_NAME is the name of the object in Riak which has too many
>>>> siblings.
>>>> 
>>>> I am definitely sure that these objects are static. Nobody deletes them,
>>>> nobody rewrites them. I have no idea why more than 100 siblings of such an
>>>> object occur.
>>>> 
>>>> This issue has the following effects:
>>>> 
>>>> - A great number of keys are loaded into RAM, and I am almost out of RAM
>>>>   (does each sibling have its own key or a duplicate of the key?).
>>>> - Nodes are slow, and adding new nodes is too slow.
>>>> - The presence of "too many siblings" affects ownership handoffs.
>>>> 
>>>> So I have several questions:
>>>> 
>>>> - Can hinted or ownership handoffs affect the sibling count (I mean, can
>>>>   siblings be created during ownership or hinted handoffs)?
>>>> - Is there any workaround for this issue? Do I need to remove siblings
>>>>   manually, or are they removed during merges, read repairs and so on?
>>>> 
>>>> 
>>>> My configuration:
>>>> 
>>>> riak from basho's packages - 2.1.3-1
>>>> riak cs from basho's packages - 2.1.0-1
>>>> 24 riak/riak-cs nodes
>>>> 32 GB RAM per node
>>>> AAE is disabled
>>>> 
>>>> 
>>>> I appreciate your help.
>>>> 
>>>> _______________________________________________
>>>> riak-users mailing list
>>>> riak-users@lists.basho.com
>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com