Hi Sergey,

This looks like the initial TCP connection is timing out when riak@de5 and riak@de6 first try to talk to the handoff IP/port for riak@de2 (which would be configured in riak@de2's app.config).
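
To confirm that from the sending side, something like the following could be run on riak@de5 and riak@de6. It is only a minimal sketch: it tries to open a TCP connection to a given host and port and reports whether that succeeded within a timeout. The function name can_connect is made up for illustration, and the host and port in the usage line are assumptions (8099 is, as far as I know, the usual handoff_port default, but use whatever riak@de2's app.config actually says).

```python
import socket
import sys


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name-resolution failures.
        return False


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical values): python check_handoff.py de2 8099
    target_host, target_port = sys.argv[1], int(sys.argv[2])
    state = "reachable" if can_connect(target_host, target_port) else "NOT reachable"
    print(f"{target_host}:{target_port} is {state}")
```

If this reports the port as not reachable from de5 and de6 but reachable from other nodes, a firewall rule or the handoff bind address on de2 would be the first things to look at.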

You may have already gotten to the bottom of why that's happening, but the first thing to try would be to set the cluster handoff_concurrency limit to "0" and then back to the default. That interrupts and restarts any in-progress transfers, and you can then watch the network traffic on the handoff ports. You can do this with "riak-admin transfer-limit" [1]. If you don't specify a "node" (as shown in the docs), it will set the limit for the whole cluster (which is what you want to do).

Hope that helps. Keep us posted.

Mark

[1] http://docs.basho.com/riak/latest/ops/running/tools/riak-admin/#transfer-limit

On Tue, Oct 22, 2013 at 2:27 AM, <fenix.ser...@gmail.com> wrote:
> Hi all
>
> What to do in case of loss of the primary partitions !?
>
> 6 node cluster, leveldb, 1.3.2
>
> 5-6 nodes always waiting to handoff 46 partitions
>
> 'riak@de6' waiting to handoff 46 partitions
> 'riak@de5' waiting to handoff 46 partitions
>
> Active Transfers:
>
> transfer type: hinted_handoff
> vnode type: riak_kv_vnode
> partition: 919147514102638163401536164325474867830488825856
> started: 2013-10-22 08:18:35 [-131940361.00 us ago]
> last update: no updates seen
> objects transferred: unknown
>
> unknown
> riak@de5 =======================> riak@de2
> unknown
>
> transfer type: hinted_handoff
> vnode type: riak_kv_vnode
> partition: 667951920186389224335277833702363723827125420032
> started: 2013-10-22 08:18:40 [-136954679.00 us ago]
> last update: no updates seen
> objects transferred: unknown
>
> unknown
> riak@de6 =======================> riak@de2
> unknown
>
> .......
>
>
> How to fix/disable these handoffs and errors:
>
> 2013-10-21 23:59:56.894 [error]
> <0.9317.693>@riak_core_handoff_sender:start_fold:226 hinted_handoff transfer
> of riak_kv_vnode from 'riak@de5'
> 987655403352524237692333890859050634376860663808 to 'riak@de2'
> 987655403352524237692333890859050634376860663808 failed because of
> error:{badmatch,{error,timeout}}
> [{riak_core_handoff_sender,start_fold,5,[{file,"src/riak_core_handoff_sender.erl"},{line,101}]}]
> ....
>
> Thanks,
> Sergey
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com