
  I have a 12-node Riak cluster running Riak 0.14.2.  Several nodes
crashed with OOM errors, and after restarting them I see the following when
running riak-admin transfers:

Attempting to restart script through sudo -u riak
'riak@' waiting to handoff 1 partitions
'riak@' does not have 1 primary partitions running
'riak@' waiting to handoff 1 partitions
'riak@' does not have 1 primary partitions running
'riak@' waiting to handoff 1 partitions
'riak@' does not have 1 primary partitions running
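For anyone who wants to tally this output per node, here is a minimal Python sketch. It assumes only the two line formats shown above; the function name `summarize_transfers` and the node names `riak@host-a` and `riak@host-b` in the sample are hypothetical placeholders, since the real node names are not shown here.

```python
import re
from collections import defaultdict

# Line formats copied from the `riak-admin transfers` output above.
HANDOFF_RE = re.compile(
    r"^'(?P<node>[^']*)' waiting to handoff (?P<n>\d+) partitions$")
PRIMARY_RE = re.compile(
    r"^'(?P<node>[^']*)' does not have (?P<n>\d+) primary partitions running$")


def summarize_transfers(text):
    """Return {node: {"handoff": int, "missing_primary": int}}."""
    summary = defaultdict(lambda: {"handoff": 0, "missing_primary": 0})
    for line in text.splitlines():
        line = line.strip()
        m = HANDOFF_RE.match(line)
        if m:
            summary[m.group("node")]["handoff"] += int(m.group("n"))
            continue
        m = PRIMARY_RE.match(line)
        if m:
            summary[m.group("node")]["missing_primary"] += int(m.group("n"))
    # Lines matching neither pattern (e.g. the sudo notice) are ignored.
    return dict(summary)


# Hypothetical sample: host-a and host-b are made-up node names.
sample = """\
Attempting to restart script through sudo -u riak
'riak@host-a' waiting to handoff 1 partitions
'riak@host-a' does not have 1 primary partitions running
'riak@host-b' waiting to handoff 1 partitions
'riak@host-b' does not have 1 primary partitions running
"""

for node, counts in sorted(summarize_transfers(sample).items()):
    print(node, counts)
```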

The only errors in the whole cluster are 2 errors on, both of
the form

=ERROR REPORT==== 15-Feb-2012::17:49:38 ===
Handoff receiver for partition
exiting abnormally after processing 7 objects:

=ERROR REPORT==== 15-Feb-2012::17:49:41 ===
Handoff receiver for partition
exiting abnormally after processing 7 objects: 

I tried strobing through restarting all nodes, which seemed to temporarily
fix this particular node, but then I think this error cropped up.

If there's anything I can try or more information I can give, let me know.
The boxes are 16-core with 24 GB of memory, with data in bitcask on an SSD.
There are 1024 partitions spread across 12 machines.  Each machine does
roughly 55-120K vnode gets per second, 20-40K node gets per second, 1-2K
vnode puts per second, and 1-2K node puts per second.
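
To put the ring size in per-node terms, here is a rough Python sketch of how 1024 partitions divide over 12 machines. It assumes a perfectly even split, which is an idealization; the actual ownership on the cluster may differ. The helper name `partitions_per_node` is hypothetical.

```python
RING_SIZE = 1024  # partitions, from the description above
NODES = 12        # machines, from the description above


def partitions_per_node(ring_size, nodes):
    """Idealized even split: the first `extra` nodes get one more partition."""
    base, extra = divmod(ring_size, nodes)
    return [base + 1 if i < extra else base for i in range(nodes)]


counts = partitions_per_node(RING_SIZE, NODES)
print(counts)       # per-node partition counts under the even-split assumption
print(sum(counts))  # should equal RING_SIZE
```

Under that assumption each machine owns 85 or 86 partitions, so a single stuck handoff is a small fraction of any one node's data.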

Thanks for the help,


Anthony Molinaro                           <antho...@alumni.caltech.edu>

riak-users mailing list
