Hello Everyone,
Two of our five nodes are seeing the 100th-percentile GET/PUT times (node_[get | put]_fsm_time_100) increase to as high as 8 seconds. Looking at our available metrics, we see huge amounts of memory being used by Erlang processes (memory_processes_used).

We normally see Erlang processes use tens of MBs, and occasionally a few hundred MBs for short periods. One node is now using between 5.2 GB and 18.5 GB. The other one is just a little lower: 4 GB to 14 GB. Our average object size is roughly 25 KB.

The logs on these two nodes have lots of entries like these:

2013-11-14 09:26:03.961 [info] <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor long_gc <0.19362.2932> [{initial_call,{riak_core_coverage_fsm,init,1}},{almost_current_function,{sms,sms,1}},{message_queue_len,41}] [{timeout,5356},{old_heap_block_size,0},{heap_block_size,870001580},{mbuf_size,0},{stack_size,54},{old_heap_size,0},{heap_size,336260414}]

2013-11-14 09:26:03.961 [info] <0.83.0>@riak_core_sysmon_handler:handle_event:92 monitor large_heap <0.19362.2932> [{initial_call,{riak_core_coverage_fsm,init,1}},{almost_current_function,{sms,sms,1}},{message_queue_len,41}] [{old_heap_block_size,0},{heap_block_size,870001580},{mbuf_size,0},{stack_size,54},{old_heap_size,0},{heap_size,336260414}]

2013-11-14 09:26:03.968 [error] <0.3205.3273> CRASH REPORT Process <0.3205.3273> with 0 neighbours crashed with reason: no function clause matching webmachine_request:peer_from_peername({error,enotconn}, {webmachine_request,{wm_reqstate,#Port<0.38822194>,[],undefined,undefined,undefined,{wm_reqdata,...},...}}) line 150

Has anyone seen this before?

--
Dave Brady