Sometimes it's more than 30 handoffs:

Attempting to restart script through sudo -H -u riak
'riak@10.47.109.209' waiting to handoff 6 partitions
'riak@10.47.109.208' waiting to handoff 2 partitions
'riak@10.47.109.207' waiting to handoff 1 partitions
'riak@10.47.109.206' waiting to handoff 14 partitions
'riak@10.47.109.205' waiting to handoff 12 partitions
'riak@10.47.109.204' waiting to handoff 14 partitions
'riak@10.47.109.203' waiting to handoff 16 partitions
'riak@10.47.109.202' waiting to handoff 3 partitions
'riak@10.47.109.201' waiting to handoff 3 partitions
'riak@10.46.109.209' waiting to handoff 4 partitions
'riak@10.46.109.208' waiting to handoff 1 partitions
'riak@10.46.109.207' waiting to handoff 4 partitions
'riak@10.46.109.206' waiting to handoff 12 partitions
'riak@10.46.109.205' waiting to handoff 12 partitions
'riak@10.46.109.204' waiting to handoff 13 partitions
'riak@10.46.109.203' waiting to handoff 12 partitions
'riak@10.46.109.202' waiting to handoff 17 partitions
'riak@10.46.109.201' waiting to handoff 12 partitions
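
To follow these numbers over time, the pending counts could be summed with a small script. This is only a rough sketch in Python: it assumes the lines look exactly like the ones pasted above, it works on text you feed it (it does not call riak-admin itself), and the name pending_handoffs is a made-up helper, not part of Riak.

```python
import re

# Assumed line format, taken from the pasted output:
#   'riak@10.47.109.209' waiting to handoff 6 partitions
PENDING = re.compile(
    r"'(?P<node>[^']+)' waiting to handoff (?P<count>\d+) partitions"
)


def pending_handoffs(text):
    """Return {node name: pending partition count} for every matching line.

    Hypothetical helper for illustration; lines that do not match the
    assumed format (such as the sudo notice) are ignored.
    """
    return {m.group("node"): int(m.group("count")) for m in PENDING.finditer(text)}


# Three lines copied from the output above, used as sample input.
sample = """\
Attempting to restart script through sudo -H -u riak
'riak@10.47.109.209' waiting to handoff 6 partitions
'riak@10.47.109.208' waiting to handoff 2 partitions
'riak@10.47.109.207' waiting to handoff 1 partitions
"""

counts = pending_handoffs(sample)
total = sum(counts.values())
print(f"{len(counts)} nodes, {total} partitions waiting for handoff")
```

Run against captured output at intervals, this would show whether the backlog is draining or growing.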

On Thu, 18 Jul 2013 20:21:57 +0200
Simon Effenberg <seffenb...@team.mobile.de> wrote:

> Hi @list,
>
> I see sometimes logs talking about "hinted_handoff transfer of .. failed
> because of TCP recv timeout".
> Also riak-admin transfers shows me many handoffs (is it possible to give some
> insights about "how many" handoffs happened through "riak-admin status"?).
>
> - Is it a normal behavior to have up to 30 handoffs from/to different nodes?
> - How can I get down to the problem with the TCP recv timeout? I'm not sure
>   if this is a network problem or if the other node is too slow. The load is ok
>   on the machines (some IOwait but not 100%). Maybe interfering with AAE?
>
> Here the log information about the TCP recv timeout. But that is not that
> often but handoffs happens really often:
>
> 2013-07-18 16:22:05.654 UTC [error]
> <0.28933.14>@riak_core_handoff_sender:start_fold:216 hinted_handoff transfer
> of riak_kv_vnode from 'riak@10.46.109.207'
> 1118962191081472546749696200048404186924073353216 to 'riak@10.46.109.205'
> 1118962191081472546749696200048404186924073353216 failed because of TCP recv
> timeout
> 2013-07-18 16:22:05.673 UTC [error]
> <0.202.0>@riak_core_handoff_manager:handle_info:282 An outbound handoff of
> partition riak_kv_vnode 1118962191081472546749696200048404186924073353216 was
> terminated for reason: {shutdown,timeout}
>
>
> Thanks in advance
> Simon
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

-- 
Simon Effenberg | Site Ops Engineer | mobile.international GmbH
Fon: + 49-(0)30-8109 - 7173
Fax: + 49-(0)30-8109 - 7131
Mail: seffenb...@team.mobile.de
Web: www.mobile.de
Marktplatz 1 | 14532 Europarc Dreilinden | Germany
Geschäftsführer: Malte Krüger
HRB Nr.: 18517 P, Amtsgericht Potsdam
Sitz der Gesellschaft: Kleinmachnow