Could you elaborate on what constitutes deleting the PG in this instance? Is 
a simple `rm` of the directories with the PG number in `current` sufficient, 
or does it need some poking of anything else?
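
For concreteness, is it something along these lines (OSD ID, journal path 
and PG ID below are all made up, and I'd stop the OSD first), or is 
ceph-objectstore-tool the proper route?

    # stop the OSD holding the bad copy of the PG
    systemctl stop ceph-osd@12

    # the naive approach: remove the PG's directories from the filestore
    rm -rf /var/lib/ceph/osd/ceph-12/current/1.2f_head
    rm -rf /var/lib/ceph/osd/ceph-12/current/1.2f_TEMP

    # or the tool-assisted approach, which I assume also cleans up the
    # PG's metadata rather than just its on-disk directories:
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
        --journal-path /var/lib/ceph/osd/ceph-12/journal \
        --pgid 1.2f --op remove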

It is conceivable that there is a fault with the disks; they are known to be 
‘faulty’ in the general sense that they suffer a cliff-edge performance 
issue. However, I’m somewhat confused about why this would suddenly start 
happening in the way it has been detected.

We are past early-life failures, most of these disks don’t appear to have 
any significant issues in their SMART data to indicate that write failures 
are occurring, and I hadn’t seen this error once until a couple of weeks ago 
(we’ve been operating this cluster for over 2 years now).
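
(For reference, I’ve been checking the SMART side with plain smartctl, along 
the lines of:

    # device name is just an example; run against each data disk
    smartctl -a /dev/sdb | grep -Ei 'reallocated|pending|uncorrect'

and the usual failure indicators look clean.)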

The only versions I’m seeing running currently (just double-checked) are 
10.2.5, 10.2.6 and 10.2.7. There was one node that had hammer running on it 
a while back, but it’s been running jewel for months now, so I doubt it’s 
related to that.
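
(In case anyone wants to reproduce the check, asking the daemons directly is 
one way to confirm what every OSD reports:

    # query every running OSD for its version
    ceph tell osd.* version
)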



> On 26 May 2017, at 00:22, Gregory Farnum <gfar...@redhat.com> wrote:
