Hi Luke,

Thanks for responding! I've been unavailable most of the day, hence my late reply.

I'll gather those logs tomorrow.

No queries are running, and no one has tried to get a key list.

We restarted our programs to clear any connections they had to the slow nodes after disabling those nodes in haproxy.

One node has slowly, over the last five hours, started to head back to normal. Its peak usage is down to 5.2 GB.

The other node has gotten even worse. It's now ranging from 6.5 GB to 23 GB.

--
Dave Brady

----- Original Message -----
From: "Luke Bakken" <lbak...@basho.com>
To: "Dave Brady" <dbr...@weborama.com>
Cc: "riak-users" <riak-users@lists.basho.com>
Sent: Thursday, 14 November 2013 18:14:43
Subject: Re: Degraded response times with massive increase in Erlang VM process memory use

Hi Dave,

A few people have chimed in to ask what kinds of queries are running / have been run recently against this cluster - Map/Reduce, list keys, 2i?

--
Luke Bakken
CSE
lbak...@basho.com

On Thu, Nov 14, 2013 at 1:56 AM, Dave Brady < dbr...@weborama.com > wrote:

Hello Everyone,

Two of our five nodes are seeing the 100% GET/PUT times (node_[get | put]_fsm_time_100) increase to as high as 8 seconds, and looking at our available metrics we see huge amounts of memory being used by Erlang processes (memory_processes_used).

We normally see Erlang processes use tens of MBs, and occasionally a few hundred MBs for short periods.

One node is now using between 5.2 GB and 18.5 GB. The other one is just a little lower: 4 GB to 14 GB.

Our average object size is roughly 25 KB.

The logs on these two nodes have lots of:

2013-11-14 09:26:03.961 [info] <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc <0.19362.2932> [{initial_call,{riak_core_coverage_fsm,init,1}},{almost_current_function,{sms,sms,1}},{message_queue_len,41}] [{timeout,5356},{old_heap_block_size,0},{heap_block_size,870001580},{mbuf_size,0},{stack_size,54},{old_heap_size,0},{heap_size,336260414}]

2013-11-14 09:26:03.961 [info] <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.19362.2932> [{initial_call,{riak_core_coverage_fsm,init,1}},{almost_current_function,{sms,sms,1}},{message_queue_len,41}] [{old_heap_block_size,0},{heap_block_size,870001580},{mbuf_size,0},{stack_size,54},{old_heap_size,0},{heap_size,336260414}]

2013-11-14 09:26:03.968 [error] <0.3205.3273> CRASH REPORT Process <0.3205.3273> with 0 neighbours crashed with reason: no function clause matching webmachine_request:peer_from_peername({error,enotconn}, {webmachine_request,{wm_reqstate,#Port<0.38822194>,[],undefined,undefined,undefined,{wm_reqdata,...},...}}) line 150

Anyone seen this before?

--
Dave Brady

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
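
A note for reading the long_gc and large_heap entries quoted in this thread: the Erlang system monitor reports heap sizes in machine words, not bytes. Assuming a 64-bit VM (8-byte words, which is an assumption and not stated in the thread), a heap_block_size of 870001580 would be about 6.96 GB for a single process, which is in the same range as the memory figures reported. The following Python is only a rough sketch for pulling those numbers out of log lines shaped like the ones quoted; the function name, the regular expression, and the word size are illustrative choices, not part of Riak.

```python
import re

# Assumption: 64-bit Erlang VM, so one heap word is 8 bytes.
WORD_SIZE_BYTES = 8

# Matches sysmon lines shaped like the long_gc / large_heap entries
# quoted in this thread. Other log layouts are not handled.
SYSMON_RE = re.compile(
    r"monitor (?P<kind>long_gc|large_heap) (?P<pid><[\d.]+>)"
    r".*?\{initial_call,\{(?P<initial_call>[^}]*)\}\}"
    r".*?\{heap_block_size,(?P<heap_block_words>\d+)\}"
    r".*?\{heap_size,(?P<heap_words>\d+)\}"
)


def parse_sysmon_lines(lines):
    """Return one dict per matching line, with heap sizes converted to bytes."""
    rows = []
    for line in lines:
        match = SYSMON_RE.search(line)
        if match is None:
            continue  # not a long_gc / large_heap entry
        rows.append({
            "kind": match.group("kind"),
            "pid": match.group("pid"),
            "initial_call": match.group("initial_call"),
            "heap_block_bytes": int(match.group("heap_block_words")) * WORD_SIZE_BYTES,
            "heap_bytes": int(match.group("heap_words")) * WORD_SIZE_BYTES,
        })
    return rows


if __name__ == "__main__":
    import sys

    # Usage: python sysmon_heaps.py < console.log
    for row in parse_sysmon_lines(sys.stdin):
        gib = row["heap_block_bytes"] / 2**30
        print(f'{row["kind"]:10} {row["pid"]:18} {row["initial_call"]:32} {gib:6.2f} GiB')
```

Grouping the output by initial_call would show whether the large heaps all belong to riak_core_coverage_fsm processes, as in the two entries quoted.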