If you are seeing this on Riak 1.4, run riak-admin diag and check the
newly recommended kernel parameters. Also, uncomment the +zdbbl 32768
parameter in vm.args. What you describe is similar to what happened to
us when we upgraded to 1.4.
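
A minimal sketch of the vm.args change, in case it helps to script it across nodes. It assumes the setting ships as a commented-out line such as "##+zdbbl 32768" (the exact comment style may differ in your file), and the function name enable_zdbbl is only illustrative. +zdbbl is the Erlang distribution buffer busy limit in kilobytes, so 32768 is 32 MB. The node has to be restarted before a vm.args change takes effect.

```python
import re

# Matches a commented-out "+zdbbl <n>" line: optional indentation, one or
# more '#', optional spaces, then the flag itself and nothing else.
COMMENTED_ZDBBL = re.compile(
    r"^[ \t]*#+[ \t]*(\+zdbbl[ \t]+\d+)[ \t]*$", re.MULTILINE
)


def enable_zdbbl(vm_args_text):
    """Return the vm.args text with a commented-out +zdbbl line uncommented.

    Lines that merely mention +zdbbl in a prose comment are left alone, and
    text where the flag is already active is returned unchanged.
    """
    return COMMENTED_ZDBBL.sub(r"\1", vm_args_text)


if __name__ == "__main__":
    # Hypothetical vm.args excerpt, used only for illustration.
    example = "-smp enable\n##+zdbbl 32768\n"
    print(enable_zdbbl(example))
```

Read the file, pass its contents through enable_zdbbl, and write the result back. Where vm.args lives depends on how Riak was installed.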
HTH,
Guido.
On 18/07/13 19:21, Simon Effenberg wrote:
Hi @list,
I sometimes see log entries reporting "hinted_handoff transfer of .. failed because of
TCP recv timeout".
Also, riak-admin transfers shows me many handoffs (is it possible to get some insight into
how many handoffs have happened via "riak-admin status"?).
- Is it normal to have up to 30 handoffs from/to different nodes?
- How can I get to the bottom of the TCP recv timeout? I'm not sure if
this is a network problem or if the other node is too slow. The load is OK on
the machines (some IOwait, but not 100%). Could it be interference from AAE?
Here is the log output for the TCP recv timeout. The timeout itself is not that
frequent, but handoffs happen very often:
2013-07-18 16:22:05.654 UTC [error]
<0.28933.14>@riak_core_handoff_sender:start_fold:216 hinted_handoff transfer of
riak_kv_vnode from 'riak@10.46.109.207'
1118962191081472546749696200048404186924073353216 to 'riak@10.46.109.205'
1118962191081472546749696200048404186924073353216 failed because of TCP recv timeout
2013-07-18 16:22:05.673 UTC [error]
<0.202.0>@riak_core_handoff_manager:handle_info:282 An outbound handoff of
partition riak_kv_vnode 1118962191081472546749696200048404186924073353216 was
terminated for reason: {shutdown,timeout}
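
To put a number on how often the timeout occurs and between which nodes, the sender-side error lines can be counted straight from the log text. This is only a rough sketch: it assumes entries have the same shape as the excerpt above, the helper name count_recv_timeouts is made up for illustration, and it reads log text rather than anything reported by riak-admin.

```python
import re
from collections import Counter

# Sender-side failure line as it appears in the excerpt above, once line
# wrapping has been collapsed to single spaces.
TIMEOUT_RE = re.compile(
    r"(\w+) transfer of (\w+) from '([^']+)' (\d+) to '([^']+)' (\d+) "
    r"failed because of TCP recv timeout"
)


def count_recv_timeouts(log_text):
    """Count 'TCP recv timeout' handoff failures per (source, target) node pair."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    counts = Counter()
    for match in TIMEOUT_RE.finditer(flat):
        source, target = match.group(3), match.group(5)
        counts[(source, target)] += 1
    return counts


# The two entries from the excerpt above, wrapped as they were in the mail.
sample = """
2013-07-18 16:22:05.654 UTC [error]
<0.28933.14>@riak_core_handoff_sender:start_fold:216 hinted_handoff transfer of
riak_kv_vnode from 'riak@10.46.109.207'
1118962191081472546749696200048404186924073353216 to 'riak@10.46.109.205'
1118962191081472546749696200048404186924073353216 failed because of TCP recv timeout
2013-07-18 16:22:05.673 UTC [error]
<0.202.0>@riak_core_handoff_manager:handle_info:282 An outbound handoff of
partition riak_kv_vnode 1118962191081472546749696200048404186924073353216 was
terminated for reason: {shutdown,timeout}
"""

if __name__ == "__main__":
    for (source, target), n in count_recv_timeouts(sample).items():
        print(source, "->", target, n)
```

Only the riak_core_handoff_sender line is counted. The handoff manager's {shutdown,timeout} entry reports the same failure, so counting both would double the totals.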
Thanks in advance
Simon
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com