Sounds like zdbbl. I'm running 1.3.1, but it started after I added 6 more
nodes to the previously 12-node cluster. So maybe it is because the
cluster now has 18 nodes?

I'll try the zdbbl stuff. Any other hints would be welcome (if the new
kernel parameters are also good for 1.3.1, could you provide them?).
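
For reference, the vm.args change Guido suggests would look roughly like
the fragment below. This is only a sketch: the flag and value are the ones
quoted in his mail, the comment lines are mine, and I am assuming the line
can simply be added if a 1.3.1 vm.args does not already carry it commented
out. The node has to be restarted before a vm.args change takes effect.

```
## Raise the Erlang distribution buffer busy limit.
## The value is in kilobytes, so 32768 means 32 MB (the VM default is 1024).
+zdbbl 32768
```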

Cheers
Simon

On Thu, 18 Jul 2013 19:34:18 +0100
Guido Medina <guido.med...@temetra.com> wrote:

> If what you are describing is happening on 1.4, run riak-admin diag
> to see the new recommended kernel parameters. Also, in vm.args,
> uncomment the +zdbbl 32768 parameter; what you are describing is
> similar to what happened to us when we upgraded to 1.4.
> 
> HTH,
> 
> Guido.
> 
> On 18/07/13 19:21, Simon Effenberg wrote:
> > Hi @list,
> >
> > I sometimes see log entries saying "hinted_handoff transfer of .. failed
> > because of TCP recv timeout".
> > Also, riak-admin transfers shows me many handoffs (is it possible to get
> > some insight into how many handoffs have happened through "riak-admin
> > status"?).
> >
> > - Is it normal behavior to have up to 30 handoffs from/to different nodes?
> > - How can I get to the bottom of the TCP recv timeout problem? I'm not sure
> > if this is a network problem or if the other node is too slow. The load is
> > ok on the machines (some IOwait but not 100%). Maybe it is interfering with AAE?
> >
> > Here is the log information about the TCP recv timeout. The timeout itself
> > is not that frequent, but handoffs happen really often:
> >
> > 2013-07-18 16:22:05.654 UTC [error] 
> > <0.28933.14>@riak_core_handoff_sender:start_fold:216 hinted_handoff 
> > transfer of riak_kv_vnode from 'riak@10.46.109.207' 
> > 1118962191081472546749696200048404186924073353216 to 'riak@10.46.109.205' 
> > 1118962191081472546749696200048404186924073353216 failed because of TCP 
> > recv timeout
> > 2013-07-18 16:22:05.673 UTC [error] 
> > <0.202.0>@riak_core_handoff_manager:handle_info:282 An outbound handoff of 
> > partition riak_kv_vnode 1118962191081472546749696200048404186924073353216 
> > was terminated for reason: {shutdown,timeout}
> >
> >
> > Thanks in advance
> > Simon
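
On the question of how many handoffs fail: one way to get a rough count is
to tally the failure entries directly from the log. The Python sketch below
is an illustration only, not a Riak tool. It assumes each entry has the same
shape as the "failed because of" line quoted above, and the function name
count_handoff_failures is made up for this example.

```python
import re
from collections import Counter

# Matches the handoff failure message as it appears in the quoted log entry:
#   <kind> transfer of <vnode> from '<node>' <index> to '<node>' <index>
#   failed because of <reason>
FAILURE_RE = re.compile(
    r"(?P<kind>\w+) transfer of (?P<vnode>\w+) "
    r"from '(?P<src>[^']+)' (?P<src_idx>\d+) "
    r"to '(?P<dst>[^']+)' (?P<dst_idx>\d+) "
    r"failed because of (?P<reason>.+)$"
)


def count_handoff_failures(lines):
    """Count failed handoffs per (source node, target node, reason)."""
    counts = Counter()
    for line in lines:
        # Collapse runs of whitespace so stray spacing does not break the match.
        text = " ".join(line.split())
        match = FAILURE_RE.search(text)
        if match:
            counts[(match["src"], match["dst"], match["reason"])] += 1
    return counts


if __name__ == "__main__":
    # The two entries from the mail above, each joined onto a single line.
    sample = [
        "2013-07-18 16:22:05.654 UTC [error] "
        "<0.28933.14>@riak_core_handoff_sender:start_fold:216 hinted_handoff "
        "transfer of riak_kv_vnode from 'riak@10.46.109.207' "
        "1118962191081472546749696200048404186924073353216 to "
        "'riak@10.46.109.205' "
        "1118962191081472546749696200048404186924073353216 failed because of "
        "TCP recv timeout",
        "2013-07-18 16:22:05.673 UTC [error] "
        "<0.202.0>@riak_core_handoff_manager:handle_info:282 An outbound "
        "handoff of partition riak_kv_vnode "
        "1118962191081472546749696200048404186924073353216 was terminated "
        "for reason: {shutdown,timeout}",
    ]
    for (src, dst, reason), n in count_handoff_failures(sample).items():
        print(f"{src} -> {dst}: {n} failure(s), reason: {reason}")
```

To run it against a real node, pass an open log file instead of the sample
list; the location of that file depends on the installation. Grouping by
node pair should help show whether the timeouts cluster around one slow
node or are spread across the cluster.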
> >
> > _______________________________________________
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 


-- 
Simon Effenberg | Site Ops Engineer | mobile.international GmbH
Fon:     + 49-(0)30-8109 - 7173
Fax:     + 49-(0)30-8109 - 7131

Mail:     seffenb...@team.mobile.de
Web:    www.mobile.de

Marktplatz 1 | 14532 Europarc Dreilinden | Germany


Geschäftsführer: Malte Krüger
HRB Nr.: 18517 P, Amtsgericht Potsdam
Sitz der Gesellschaft: Kleinmachnow 
