Hello Edgar,

The persistent handoff activity usually indicates a network issue that is causing fallback vnodes to start frequently. Based on your previous messages, you are handing off quite a few vnodes that hold only 1 object, so those vnodes are not long-lived. Additionally, the most recent errors show a TCP recv timeout, which also points to a problem at the network layer.
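One way to check this pattern against a larger slice of the logs is to tally the handoff messages. The following is only a minimal sketch: the regular expressions are modelled on the log lines quoted in this thread, the function name `summarize_handoffs` is made up for illustration, and it assumes each log entry sits on a single line in the real log file (the entries quoted below were wrapped by the mail client).

```python
import re
from collections import Counter

# Patterns modelled on the riak_core_handoff_sender messages quoted in this
# thread. Assumption: in the real log file each entry is a single line.
COMPLETED = re.compile(
    r"hinted_handoff transfer of \S+ "
    r"from '(?P<src>[^']+)' \d+ "
    r"to '(?P<dst>[^']+)' \d+ "
    r"completed: sent .*? in (?P<sent>\d+) of (?P<total>\d+) objects"
)
FAILED = re.compile(
    r"hinted_handoff transfer of \S+ "
    r"from '(?P<src>[^']+)' \d+ "
    r"to '(?P<dst>[^']+)' \d+ "
    r"failed because of (?P<reason>.+)$"
)


def summarize_handoffs(lines):
    """Count completed and failed hinted handoffs found in log lines."""
    completed = 0
    single_object = 0
    completed_by_destination = Counter()
    failures = Counter()
    for line in lines:
        done = COMPLETED.search(line)
        if done:
            completed += 1
            completed_by_destination[done.group("dst")] += 1
            # A handoff of exactly one object suggests a short-lived fallback vnode.
            if int(done.group("total")) == 1:
                single_object += 1
            continue
        failed = FAILED.search(line)
        if failed:
            key = (failed.group("dst"), failed.group("reason").strip())
            failures[key] += 1
    return {
        "completed": completed,
        "single_object": single_object,
        "completed_by_destination": dict(completed_by_destination),
        "failures": dict(failures),
    }


# Sample entries copied from the messages in this thread, rejoined onto one line each.
PREFIX = "hinted_handoff transfer of riak_kv_vnode from 'riak@192.168.20.112' "
SAMPLE = [
    "2015-02-10 16:11:55.547 [info] <0.3070.753>@riak_core_handoff_sender:start_fold:148 Starting "
    + PREFIX + "765004763290394496247241279624929393101152190464 to 'riak@192.168.20.109' "
    "765004763290394496247241279624929393101152190464",
    "2015-02-10 16:11:55.548 [info] <0.3070.753>@riak_core_handoff_sender:start_fold:236 "
    + PREFIX + "765004763290394496247241279624929393101152190464 to 'riak@192.168.20.109' "
    "765004763290394496247241279624929393101152190464 completed: sent 3.15 KB bytes "
    "in 1 of 1 objects in 0.00 seconds (3.99 MB/second)",
    "2015-02-10 16:12:05.860 [info] <0.3434.753>@riak_core_handoff_sender:start_fold:236 "
    + PREFIX + "902020541790166644828836732692080926193895866368 to 'riak@192.168.20.109' "
    "902020541790166644828836732692080926193895866368 completed: sent 39.79 KB bytes "
    "in 1 of 1 objects in 0.06 seconds (699.32 KB/second)",
    "2015-02-10 16:12:05.886 [info] <0.3368.753>@riak_core_handoff_sender:start_fold:236 "
    + PREFIX + "570899077082383952423314387779798054553098649600 to 'riak@192.168.20.111' "
    "570899077082383952423314387779798054553098649600 completed: sent 3.55 KB bytes "
    "in 1 of 1 objects in 0.03 seconds (118.58 KB/second)",
    "2015-02-12 20:27:34.092 [error] <0.21096.1867>@riak_core_handoff_sender:start_fold:263 "
    + PREFIX + "1210306043414653979137426502093171875652569137152 to 'riak@192.168.20.109' "
    "1210306043414653979137426502093171875652569137152 failed because of TCP recv timeout",
]

summary = summarize_handoffs(SAMPLE)
print(summary)
```

To run it against a real node, pass an open log file instead of `SAMPLE`, for example `summarize_handoffs(open(path))`, where `path` is wherever that node writes its console log. If most completed handoffs are single-object and the failures cluster on one destination, that is consistent with a connectivity problem toward that node rather than with a large backlog of data.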
I'd be happy to investigate this issue with you. Please attach a `riak-debug` output from this node and at least one other node in the cluster so we can track the issue down.

Thanks,
Brian

On Fri, Feb 13, 2015 at 5:40 AM, Edgar Veiga <edgarmve...@gmail.com> wrote:

> Hi again everyone!
>
> - The memory usage keeps growing day by day:
> https://dl.dropboxusercontent.com/u/1962284/riak2.png
>
> - The handoffs keep on going, with strange things like a transfer started
> 1.5 days ago:
> riak-admin transfers
> 'riak@192.168.20.112' waiting to handoff 51 partitions
> 'riak@192.168.20.111' waiting to handoff 74 partitions
> 'riak@192.168.20.110' waiting to handoff 86 partitions
> 'riak@192.168.20.109' waiting to handoff 191 partitions
> 'riak@192.168.20.108' waiting to handoff 67 partitions
> 'riak@192.168.20.107' waiting to handoff 177 partitions
>
> transfer type: hinted_handoff
> vnode type: riak_kv_vnode
> partition: 51380916937414555718098294900181824909778878464
> started: 2015-02-11 21:54:07 [1.53 d ago]
> last update: no updates seen
> total size: unknown
> objects transferred: unknown
>
> - I'm starting to have some entries in the error log:
> 2015-02-12 19:58:54.026 [error]
> <0.184.0>@riak_core_handoff_manager:handle_info:289 An outbound handoff of
> partition riak_kv_vnode 936274486415109681974235595958868809467081785344
> was terminated for reason: noproc
> 2015-02-12 20:27:34.092 [error]
> <0.21096.1867>@riak_core_handoff_sender:start_fold:263 hinted_handoff
> transfer of riak_kv_vnode from 'riak@192.168.20.112'
> 1210306043414653979137426502093171875652569137152 to 'riak@192.168.20.109'
> 1210306043414653979137426502093171875652569137152 failed because of TCP
> recv timeout
> 2015-02-12 20:27:34.092 [error]
> <0.184.0>@riak_core_handoff_manager:handle_info:289 An outbound handoff of
> partition riak_kv_vnode 1210306043414653979137426502093171875652569137152
> was terminated for reason: {shutdown,timeout}
> 2015-02-12 21:25:32.852 [error]
> <0.184.0>@riak_core_handoff_manager:handle_info:289 An outbound handoff of
> partition riak_kv_vnode 742168800207099138150308704113737470919028244480
> was terminated for reason: noproc
>
> Please, can anyone give me some help with this? I'm starting to get worried
> about this behaviour. Tell me if you need more info!
>
> Thanks and Best regards,
> Edgar Veiga
>
> On 10 February 2015 at 16:16, Edgar Veiga <edgarmve...@gmail.com> wrote:
>
>> Hi all!
>>
>> I have a riak cluster, working smoothly in production for about one year,
>> with the following characteristics:
>>
>> - Version 1.4.12
>> - 6 nodes
>> - leveldb backend
>> - replication (n) = 3
>> - ~3 billion keys
>> - ~1.2 TB per node
>> - AAE disabled
>>
>> Two days ago I upgraded all of the 6 nodes from riak v1.4.8 to v1.4.12,
>> and two things started happening that are a little bit odd:
>>
>> 1) The first is the memory consumption, please check the next image to
>> understand what I mean:
>>
>> - https://dl.dropboxusercontent.com/u/1962284/riak.png
>>
>> 2) All of the machines keep logging hinted handoffs after the rolling
>> restart. I did the upgrade during non-busy hours and ensured that the
>> rolling restart was concluded only when all the in-progress handoffs were
>> concluded, but on the next day when checking the logs I realised that
>> they keep appearing...
>> Here are some random examples:
>>
>> 2015-02-10 16:11:55.547 [info]
>> <0.3070.753>@riak_core_handoff_sender:start_fold:148 Starting hinted_handoff
>> transfer of riak_kv_vnode from 'riak@192.168.20.112'
>> 765004763290394496247241279624929393101152190464 to 'riak@192.168.20.109'
>> 765004763290394496247241279624929393101152190464
>>
>> 2015-02-10 16:11:55.548 [info]
>> <0.3070.753>@riak_core_handoff_sender:start_fold:236 hinted_handoff transfer
>> of riak_kv_vnode from 'riak@192.168.20.112'
>> 765004763290394496247241279624929393101152190464 to 'riak@192.168.20.109'
>> 765004763290394496247241279624929393101152190464 completed: sent 3.15 KB
>> bytes in 1 of 1 objects in 0.00 seconds (3.99 MB/second)
>>
>> 2015-02-10 16:12:05.803 [info]
>> <0.3434.753>@riak_core_handoff_sender:start_fold:148 Starting hinted_handoff
>> transfer of riak_kv_vnode from 'riak@192.168.20.112'
>> 902020541790166644828836732692080926193895866368 to 'riak@192.168.20.109'
>> 902020541790166644828836732692080926193895866368
>>
>> 2015-02-10 16:12:05.856 [info]
>> <0.3368.753>@riak_core_handoff_sender:start_fold:148 Starting hinted_handoff
>> transfer of riak_kv_vnode from 'riak@192.168.20.112'
>> 570899077082383952423314387779798054553098649600 to 'riak@192.168.20.111'
>> 570899077082383952423314387779798054553098649600
>>
>> 2015-02-10 16:12:05.860 [info]
>> <0.3434.753>@riak_core_handoff_sender:start_fold:236 hinted_handoff transfer
>> of riak_kv_vnode from 'riak@192.168.20.112'
>> 902020541790166644828836732692080926193895866368 to 'riak@192.168.20.109'
>> 902020541790166644828836732692080926193895866368 completed: sent 39.79 KB
>> bytes in 1 of 1 objects in 0.06 seconds (699.32 KB/second)
>>
>> 2015-02-10 16:12:05.886 [info]
>> <0.3368.753>@riak_core_handoff_sender:start_fold:236 hinted_handoff transfer
>> of riak_kv_vnode from 'riak@192.168.20.112'
>> 570899077082383952423314387779798054553098649600 to 'riak@192.168.20.111'
>> 570899077082383952423314387779798054553098649600 completed: sent 3.55 KB
>> bytes in 1 of 1 objects in 0.03 seconds (118.58 KB/second)
>>
>> Should I be worried or is this normal on this version?
>>
>> Best regards,
>>
>> Edgar
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
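For tracking whether the backlog shrinks over time, the `riak-admin transfers` listing quoted above can be totalled with a few lines of Python. This is a rough sketch only: the regular expression assumes the exact "waiting to handoff N partitions" wording shown in this thread, and the helper name `waiting_partitions` is invented for illustration.

```python
import re

# Assumption: the wording matches the `riak-admin transfers` output quoted
# in this thread; other Riak versions may phrase it differently.
WAITING = re.compile(
    r"'(?P<node>[^']+)' waiting to handoff (?P<count>\d+) partitions"
)


def waiting_partitions(output):
    """Map each node name to the number of partitions it is waiting to hand off."""
    return {m.group("node"): int(m.group("count")) for m in WAITING.finditer(output)}


# Listing copied from the quoted message.
SAMPLE_OUTPUT = """\
'riak@192.168.20.112' waiting to handoff 51 partitions
'riak@192.168.20.111' waiting to handoff 74 partitions
'riak@192.168.20.110' waiting to handoff 86 partitions
'riak@192.168.20.109' waiting to handoff 191 partitions
'riak@192.168.20.108' waiting to handoff 67 partitions
'riak@192.168.20.107' waiting to handoff 177 partitions
"""

waiting = waiting_partitions(SAMPLE_OUTPUT)
# Print nodes with the largest backlog first, then the cluster-wide total.
for node, count in sorted(waiting.items(), key=lambda item: item[1], reverse=True):
    print(f"{node}: {count}")
print("total:", sum(waiting.values()))
```

Capturing the command output at intervals and comparing the totals would show whether handoffs are draining or being regenerated as fast as they complete.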