Re: Upgraded riak 1.4.9 is pegging the CPU

2014-06-05 Thread Engel Sanchez
Alain, thanks for the logs you sent me on the side. I'm not yet sure what the root cause is, but I saw a lot of handoff activity and busy distributed port messages, which indicate the single TCP connection between two Erlang nodes is completely saturated. Since there is too much going on, turning

Re: Upgraded riak 1.4.9 is pegging the CPU

2014-06-05 Thread Engel Sanchez
Hi Alain. I don't think you are seeing the AAE issue. The problem with upgrading from 1.4.4-1.4.7 to 1.4.8 was a broken hash function in those, which made the AAE trees incompatible. You should not have the same problem in 1.4.0. It seems that Erlang processes are repeatedly crashing and restartin

Re: Upgraded riak 1.4.9 is pegging the CPU

2014-06-05 Thread Alain Rodriguez
Actually, I just noticed it is likely the AAE issue: 2014-06-05 14:53:47.587 [error] <0.16054.31> CRASH REPORT Process <0.16054.31> with 0 neighbours exited with reason: no match of right hand value {error,{db_open,"IO error: lock /var/lib/riak/anti_entropy/10618722833732341515073647612704243814687

Re: Upgraded riak 1.4.9 is pegging the CPU

2014-06-05 Thread Alain Rodriguez
Thanks for the quick reply, and no, I did not. Is this something I should be able to do now (stop, remove files, start again), or is it too late? How could I verify this is the issue? On Thu, Jun 5, 2014 at 8:42 AM, Shane McEwan wrote: > On 05/06/14 16:20, Alain Rodriguez wrote: > > Hi all, > > >

Re: Upgraded riak 1.4.9 is pegging the CPU

2014-06-05 Thread Shane McEwan
On 05/06/14 16:20, Alain Rodriguez wrote: > Hi all, > > I upgraded 1 of 9 riak nodes in a cluster last night from 1.4.0 to > 1.4.9. The rest are running 1.4.0. > > Ever since I am seeing the upgraded node, riak01 consuming a > significantly larger percent of CPU and the PUT times on it have gotte

Upgraded riak 1.4.9 is pegging the CPU

2014-06-05 Thread Alain Rodriguez
Hi all, I upgraded 1 of 9 riak nodes in a cluster last night from 1.4.0 to 1.4.9. The rest are running 1.4.0. Ever since, I have been seeing the upgraded node, riak01, consuming a significantly larger percentage of CPU, and the PUT times on it have gotten worse. htop indicates one particular process pegging