Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-03 Thread Brandon Morris, PMP
import --file /root/32.10c.b.export On Thu, Jun 2, 2016 at 5:10 PM, Brad Hubbard wrote: > On Thu, Jun 2, 2016 at 9:07 AM, Brandon Morris, PMP > wrote: > > > The only way that I was able to get back to Health_OK was to > export/import. * Please note, any time you use the >

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-01 Thread Brandon Morris, PMP
abling directory > >> fragmentation. > >> -Greg > >> > >> On Wed, Jun 1, 2016 at 2:14 PM, Adam Tygart wrote: > >>> I've been attempting to work through this, finding the pgs that are > >>> causing hangs, determining if they a

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-01 Thread Brandon Morris, PMP
Adam, We ran into similar issues when we got too many objects in a bucket (around 300 million). The .rgw.buckets.index pool became unable to complete backfill operations. The only way we were able to get past it was to export the offending placement group with the ceph-objectstore-tool and
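
The export/import workaround mentioned in this thread can be sketched roughly as below. This is a minimal sketch, not the poster's actual procedure: the helper name `build_objectstore_tool_cmd`, the OSD data and journal paths, and the PG id `32.10c` (guessed from the export file name quoted elsewhere in the thread) are all assumptions for illustration. The helper only builds the argument lists; it does not run anything. ceph-objectstore-tool works on an offline object store, so the OSD daemon has to be stopped before either command is run by hand.

```python
import shlex
from typing import List, Optional


def build_objectstore_tool_cmd(
    op: str,
    data_path: str,
    dump_file: str,
    pgid: Optional[str] = None,
    journal_path: Optional[str] = None,
) -> List[str]:
    """Build (but do not run) a ceph-objectstore-tool export or import command.

    Hypothetical helper for illustration; all paths are supplied by the caller.
    """
    if op not in ("export", "import"):
        raise ValueError("op must be 'export' or 'import'")
    if op == "export" and pgid is None:
        # An export has to name the placement group to dump.
        raise ValueError("export requires a pgid")

    cmd = ["ceph-objectstore-tool", "--data-path", data_path]
    if journal_path is not None:
        cmd += ["--journal-path", journal_path]
    cmd += ["--op", op, "--file", dump_file]
    if op == "export":
        cmd += ["--pgid", pgid]
    return cmd


# Assumed values: OSD ids, paths, and the PG id are placeholders.
export_cmd = build_objectstore_tool_cmd(
    "export",
    data_path="/var/lib/ceph/osd/ceph-12",
    journal_path="/var/lib/ceph/osd/ceph-12/journal",
    dump_file="/root/32.10c.b.export",
    pgid="32.10c",
)
import_cmd = build_objectstore_tool_cmd(
    "import",
    data_path="/var/lib/ceph/osd/ceph-20",
    journal_path="/var/lib/ceph/osd/ceph-20/journal",
    dump_file="/root/32.10c.b.export",
)

if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    print(shlex.join(export_cmd))
    print(shlex.join(import_cmd))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; on a real cluster each command would be reviewed and run manually against a stopped OSD.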

Re: [ceph-users] Infernalis .rgw.buckets.index objects becoming corrupted in on RHEL 7.2 during recovery

2016-03-22 Thread Brandon Morris, PMP
t our specific deployment recipe? Thanks, Brandon On Thu, Mar 17, 2016 at 4:37 PM, Brandon Morris, PMP <brandon.morris@gmail.com> wrote: > List, > > We have stood up an Infernalis 9.2.0 cluster on RHEL 7.2. We are using the > radosGW to store potentially billions of small to me

[ceph-users] Infernalis .rgw.buckets.index objects becoming corrupted in on RHEL 7.2 during recovery

2016-03-19 Thread Brandon Morris, PMP
List, We have stood up an Infernalis 9.2.0 cluster on RHEL 7.2. We are using the radosGW to store potentially billions of small to medium sized objects (64k - 1MB). We have run into an issue twice thus far where .rgw.buckets.index placement groups will become corrupt during recovery after a drive