I doubt it is sibling explosion, since 2.1.1 enables dotted version vectors (DVV) by default and your config declares a maximum of 100 siblings per object. But it may be, I guess.
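If you want to rule siblings out quickly, the node stats are cheap to check. A rough sketch, assuming the default HTTP port (8098) and a placeholder bucket name:

    # sibling and object-size stats; values that climb steadily
    # point at sibling explosion
    riak-admin status | grep -E 'siblings|objsize'

    # confirm allow_mult and dvv_enabled on the bucket you write to
    curl -s http://127.0.0.1:8098/buckets/mybucket/props | python -m json.tool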
Can you send me the crash logs, or even the crash dump itself, so I can get a better idea? It certainly looks like a memory leak of some kind.

A few questions:

- Do you use the "write once" bucket type, or the default bucket type?
- What bucket properties are set on the buckets you are writing to?
- Any queries (list_keys? 2i?)
- Which Erlang version? Built yourself, or the one shipped with Riak?
- 4 cores but 60GB of RAM, really? Is this because it's a VM?
- What does [frame-pointer] mean in the Erlang header output in your first post? I've never seen that before.

Sorry for all the questions, but at the moment I think more information is the way to go. If you want to mail me the logs off-list, that is fine too.
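If you would rather poke at the dump yourself first, the stock OTP tooling can read it, and one call on the live node usually narrows things down. A sketch, with example paths; the viewer can run on any machine with Erlang installed, not just the Riak node:

    %% OTP 17+ ships a graphical crash dump viewer in the observer app
    %% (on R16 the same crashdump_viewer module runs under webtool):
    crashdump_viewer:start("/var/log/riak/erl_crash.dump").

    %% On the live node, from `riak attach` (detach without running q(),
    %% which would stop the node):
    erlang:memory().
    %% A large 'processes' figure points at process heaps (e.g. big objects
    %% being merged in memory); a large 'binary' figure points at off-heap
    %% binaries kept alive by long-lived processes.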
Cheers,
Russell

> On 4 Oct 2015, at 01:43, Matthew Von-Maszewski <matth...@basho.com> wrote:
>
> Girish,
>
> This feels like a sibling explosion to me. I cannot help prove or fix it.
> Writing this paragraph as bait for others to help.
>
> Matthew
>
> Sent from my iPad
>
> On Oct 3, 2015, at 8:34 PM, Girish Shankarraman <gshankarra...@vmware.com> wrote:
>
>> Thank you for the response, Jon.
>>
>> So I changed it to 50% and it crashed again.
>> I have a 5-node cluster with 60GB RAM on each node. Ring size is set to 64.
>> (riak.conf attached if anyone has some ideas.)
>>
>> I still see the Erlang process consuming nearly the entire capacity of the
>> system (52GB):
>>
>>   PID USER PR NI VIRT    RES    SHR   S %CPU %MEM TIME+   COMMAND
>> 24256 riak 20 0  67.134g 0.052t 18740 D 0.0  90.0 2772:44 beam.smp
>>
>> ---- Cluster Status ----
>> Ring ready: true
>>
>> +--------------------+------+-------+-----+-------+
>> |        node        |status| avail |ring |pending|
>> +--------------------+------+-------+-----+-------+
>> | (C) riak@20.0.0.11 |valid |  up   | 20.3|  --   |
>> |     riak@20.0.0.12 |valid |  up   | 20.3|  --   |
>> |     riak@20.0.0.13 |valid |  up   | 20.3|  --   |
>> |     riak@20.0.0.14 |valid |  up   | 20.3|  --   |
>> |     riak@20.0.0.15 |valid |  up   | 18.8|  --   |
>> +--------------------+------+-------+-----+-------+
>>
>> Thanks,
>>
>> — Girish Shankarraman
>>
>> From: Jon Meredith <jmered...@basho.com>
>> Date: Thursday, October 1, 2015 at 2:06 PM
>> To: girish shankarraman <gshankarra...@vmware.com>,
>>   "riak-users@lists.basho.com" <riak-users@lists.basho.com>
>> Subject: Re: riak 2.1.1 : Erlang crash dump
>>
>> It looks like Riak was unable to allocate 4GB of memory. You may have to
>> reduce the amount of memory allocated to leveldb from the default 70%; try
>> setting this in your /etc/riak/riak.conf file:
>>
>>   leveldb.maximum_memory.percent = 50
>>
>> The memory footprint of Riak should stabilize after a few hours, and on
>> servers with smaller amounts of memory the 30% left over may not be enough.
>>
>> Please let us know how you get on.
>>
>> On Wed, Sep 30, 2015 at 5:31 PM Girish Shankarraman
>> <gshankarra...@vmware.com> wrote:
>>
>> I have a 7-node cluster for Riak with a ring_size of 128.
>>
>> System details:
>> Each node is a VM with 16GB of memory.
>> The backend is leveldb.
>> sys_system_architecture : <<"x86_64-unknown-linux-gnu">>
>> sys_system_version : <<"Erlang R16B02_basho8 (erts-5.10.3) [source] [64-bit]
>>   [smp:4:4] [async-threads:64] [kernel-poll:true] [frame-pointer]">>
>> riak_control_version : <<"2.1.1-0-g5898c40">>
>> cluster_info_version : <<"2.0.2-0-ge231144">>
>> yokozuna_version : <<"2.1.0-0-gcb41c27">>
>>
>> Scenario:
>> We have 400-1000 JSON records being written per second, each a few hundred
>> bytes. I see the following crash message in the Erlang logs after a few
>> hours of processing. Any suggestions on what could be going on here?
>>
>> ===== Tue Sep 29 20:20:56 UTC 2015
>> [os_mon] memory supervisor port (memsup): Erlang has closed
>> [os_mon] cpu supervisor port (cpu_sup): Erlang has closed
>>
>> Crash dump was written to: /var/log/riak/erl_crash.dump
>> eheap_alloc: Cannot allocate 3936326656 bytes of memory (of type "heap").
>>
>> Also tested running this at 50GB per Riak node (VM) and things work, but
>> memory keeps growing, so throwing hardware at it doesn't seem very scalable.
>>
>> Thanks,
>>
>> — Girish Shankarraman
>>
>> <riak.conf>

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com