Hello Byron,

m3.large instances have only 7.5 GiB of RAM. You can see that Riak crashed while attempting to allocate 2.12 GiB of RAM for leveldb.
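As a rough sanity check on the numbers in this thread, the memory budget can be sketched as below. This is only back-of-the-envelope arithmetic: it assumes "Level DB with 50% allocated" means 50% of total RAM, that the Solr JVM can grow to its full 2 GiB heap, and it ignores the OS and everything else on the box. The variable names are illustrative, not Riak settings.

```python
# Rough memory budget for one m3.large node, using figures from this thread.
GIB = 1024 ** 3

total_ram = 7.5 * GIB             # m3.large RAM
leveldb_budget = 0.50 * total_ram # assumption: "50% allocated" = 50% of total RAM
solr_heap = 2 * GIB               # assumption: Solr JVM uses its full 2 GiB heap
failed_alloc = 2277966824         # bytes, from the eheap_alloc crash message

# Whatever is left must cover the Erlang VM, AAE hashtrees, and the OS.
remaining = total_ram - leveldb_budget - solr_heap


def to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / GIB


print(f"leveldb budget:    {to_gib(leveldb_budget):.2f} GiB")
print(f"Solr heap:         {to_gib(solr_heap):.2f} GiB")
print(f"left for the rest: {to_gib(remaining):.2f} GiB")
print(f"failed allocation: {to_gib(failed_alloc):.2f} GiB")
print("allocation fits in what is left:", failed_alloc <= remaining)
```

Under those assumptions the single failed allocation is already larger than what is left after leveldb and Solr, which is why shrinking the Solr heap or adding RAM are the obvious levers.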
I suggest decreasing JVM (Solr) RAM back to the 1 GiB setting that ships with Riak. You can also experiment with disabling Active Anti-Entropy to reduce memory usage. Hopefully someone with more experience with Riak Search (Yokozuna) interaction with Active Anti-Entropy will chime in on this thread.

Or, increase the amount of RAM available to these VMs.

Thanks

--
Luke Bakken
Engineer
lbak...@basho.com

On Mon, Jan 25, 2016 at 10:10 AM, Sakoulas, Byron <byronsakou...@catholichealth.net> wrote:
> We are running an 8 node cluster of riak at AWS, and our nodes are
> consistently crashing with the error - Cannot allocate x bytes of memory (of
> type "heap").
>
> Here are some of the specs for our env:
>
> 8 nodes - running on M3 Larges
> Level DB with 50% allocated
> Solr with 2Gig
> We use only Immutable and CRDT data
> We have a Custom search schema
> System config matches basho recommendations
> CentOs 7
> Riak 2.0.2
> Riak java client 2.0.0
>
> Below is the console log leading up to the crash. I have also attached the
> erl_crash.dump file. Any help is greatly appreciated.
>
> 2016-01-25 16:34:16.822 [info]
> <0.2681.4>@riak_kv_exchange_fsm:key_exchange:263 Repaired 1 keys during
> active anti-entropy exchange of
> {707914855582156101004909840846949587645842325504,3} between
> {730750818665451459101842416358141509827966271488,'riakaws@172.16.65.8'}
> and
> {753586781748746817198774991869333432010090217472,'riakaws@172.16.65.12'}
> 2016-01-25 16:34:56.867 [info]
> <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap
> <0.1369.0>
> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,180682}]
> [{old_heap_block_size,0},{heap_block_size,22177879},{mbuf_size,0},{stack_size,26},{old_heap_size,0},{heap_size,8755966}]
> 2016-01-25 16:35:00.231 [info]
> <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap
> <0.1369.0>
> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,203278}]
> [{old_heap_block_size,0},{heap_block_size,26613454},{mbuf_size,0},{stack_size,26},{old_heap_size,0},{heap_size,9839470}]
> 2016-01-25 16:35:08.857 [info]
> <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap
> <0.1369.0>
> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,256704}]
> [{old_heap_block_size,0},{heap_block_size,31936144},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,12371527}]
> 2016-01-25 16:35:15.731 [info]
> <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap
> <0.1369.0>
> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,299047}]
> [{old_heap_block_size,0},{heap_block_size,38323372},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,14501169}]
> 2016-01-25 16:35:21.285 [info]
> <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap
> <0.1369.0>
> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,330848}]
> [{old_heap_block_size,0},{heap_block_size,45988046},{mbuf_size,0},{stack_size,26},{old_heap_size,0},{heap_size,16029792}]
> 2016-01-25 16:35:36.034 [info]
> <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap
> <0.1369.0>
> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,382846}]
> [{old_heap_block_size,0},{heap_block_size,55185655},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,18521726}]
> 2016-01-25 16:35:49.409 [info]
> <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap
> <0.1369.0>
> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,455689}]
> [{old_heap_block_size,0},{heap_block_size,66222786},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,21841438}]
> 2016-01-25 16:35:59.878 [info] <0.71.0> alarm_handler:
> {set,{process_memory_high_watermark,<0.1369.0>}}
> 2016-01-25 16:36:00.267 [info]
> <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap
> <0.1369.0>
> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,515497}]
> [{old_heap_block_size,0},{heap_block_size,79467343},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,24737674}]
> 2016-01-25 16:36:08.497 [info]
> <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap
> <0.1369.0>
> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{hashtree,should_insert,3}},{message_queue_len,560639}]
> [{old_heap_block_size,0},{heap_block_size,95360811},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,26973030}]
> 2016-01-25 16:36:34.806 [info]
> <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap
> <0.1369.0>
> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,691363}]
> [{old_heap_block_size,0},{heap_block_size,114432973},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,33336504}]
> 2016-01-25 16:36:55.523 [info]
> <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap
> <0.1369.0>
> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,809402}]
> [{old_heap_block_size,0},{heap_block_size,137319567},{mbuf_size,0},{stack_size,26},{old_heap_size,0},{heap_size,38698478}]
> 2016-01-25 16:37:10.427 [info]
> <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap
> <0.1369.0>
> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,897252}]
> [{old_heap_block_size,0},{heap_block_size,164783480},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,43053480}]
> 2016-01-25 16:37:46.837 [info]
> <0.10112.4>@riak_kv_exchange_fsm:key_exchange:263 Repaired 1 keys during
> active anti-entropy exchange of
> {959110449498405040071168171470060731649205731328,3} between
> {959110449498405040071168171470060731649205731328,'riakaws@172.16.65.8'}
> and
> {1004782375664995756265033322492444576013453623296,'riakaws@172.16.65.13'}
> 2016-01-25 16:37:56.113 [info]
> <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap
> <0.1369.0>
> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{hashtree,maybe_flush_buffer,1}},{message_queue_len,1132569}]
> [{old_heap_block_size,0},{heap_block_size,197740176},{mbuf_size,0},{stack_size,26},{old_heap_size,0},{heap_size,54503137}]
> 2016-01-25 16:38:29.550 [info]
> <0.94.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap
> <0.1369.0>
> [{initial_call,{yz_index_hashtree,init,1}},{almost_current_function,{eleveldb,get,3}},{message_queue_len,1313878}]
> [{old_heap_block_size,0},{heap_block_size,237288211},{mbuf_size,0},{stack_size,19},{old_heap_size,0},{heap_size,62782512}]
>
>
> erlang.log.1:
> ===== ALIVE Mon Jan 25 16:34:01 UTC 2016
>
> ===== Mon Jan 25 16:39:55 UTC 2016
> [os_mon] cpu supervisor port (cpu_sup): Erlang has closed
> [os_mon] memory supervisor port (memsup): Erlang has closed
>
> Crash dump was written to: /var/log/riak/erl_crash.dump
> eheap_alloc: Cannot allocate 2277966824 bytes of memory (of type "heap").
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com