Re: [ceph-users] fixing another remapped+incomplete EC 4+2 pg

2018-10-19 Thread Gregory Farnum
On Thu, Oct 18, 2018 at 2:28 PM Graham Allan wrote: > Thanks Greg, > > This did get resolved though I'm not 100% certain why! > > For one of the suspect shards which caused crash on backfill, I > attempted to delete the associated via s3, late last week. I then > examined the filestore OSDs and t

Re: [ceph-users] fixing another remapped+incomplete EC 4+2 pg

2018-10-18 Thread Graham Allan
Thanks Greg, This did get resolved though I'm not 100% certain why! For one of the suspect shards which caused crash on backfill, I attempted to delete the associated via s3, late last week. I then examined the filestore OSDs and the file shards were still present... maybe for an hour followi

Re: [ceph-users] fixing another remapped+incomplete EC 4+2 pg

2018-10-15 Thread Gregory Farnum
On Thu, Oct 11, 2018 at 3:22 PM Graham Allan wrote: > As the osd crash implies, setting "nobackfill" appears to let all the > osds keep running and the pg stays active and can apparently serve data. > > If I track down the object referenced below in the object store, I can > download it without e

Re: [ceph-users] fixing another remapped+incomplete EC 4+2 pg

2018-10-11 Thread Graham Allan
As the osd crash implies, setting "nobackfill" appears to let all the osds keep running and the pg stays active and can apparently serve data. If I track down the object referenced below in the object store, I can download it without error via s3... though as I can't generate a matching etag,

Re: [ceph-users] fixing another remapped+incomplete EC 4+2 pg

2018-10-09 Thread Graham Allan
On 10/09/2018 01:14 PM, Graham Allan wrote: On 10/9/2018 12:19 PM, Gregory Farnum wrote: I think unfortunately the easiest thing for you to fix this will be to set the min_size back to 4 until the PG is recovered (or at least has 5 shards done). This will be fixed in a later version of C

Re: [ceph-users] fixing another remapped+incomplete EC 4+2 pg

2018-10-09 Thread Graham Allan
On 10/9/2018 12:19 PM, Gregory Farnum wrote: On Wed, Oct 3, 2018 at 10:18 AM Graham Allan wrote: However I have one pg which is stuck in state remapped+incomplete because it has only 4 out of 6 osds running, and I have been unable to bring the missing two bac

Re: [ceph-users] fixing another remapped+incomplete EC 4+2 pg

2018-10-09 Thread Gregory Farnum
On Wed, Oct 3, 2018 at 10:18 AM Graham Allan wrote: > Following on from my previous adventure with recovering pgs in the face > of failed OSDs, I now have my EC 4+2 pool operating with min_size=5 > which is as things should be. > > However I have one pg which is stuck in state remapped+incomplete

Re: [ceph-users] fixing another remapped+incomplete EC 4+2 pg

2018-10-09 Thread Graham Allan
Oops, by "periods" I do of course mean "intervals"...! On 10/8/2018 4:57 PM, Graham Allan wrote: I'm still trying to find a way to reactivate this one pg which is incomplete. There are a lot of periods in its history based on a combination of a peering storm a couple of weeks ago, with min_size

Re: [ceph-users] fixing another remapped+incomplete EC 4+2 pg

2018-10-08 Thread Graham Allan
I'm still trying to find a way to reactivate this one pg which is incomplete. There are a lot of periods in its history based on a combination of a peering storm a couple of weeks ago, with min_size being set too low for safety. At this point I think there is no chance of bringing back the full
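
For readers following the min_size discussion in this thread, here is a rough sketch of the shard arithmetic only. It assumes a simplified model in which a PG in an EC k+m pool can go active when at least min_size shards are available. The function name and parameters are hypothetical, not part of any Ceph API, and the model deliberately ignores peering history (the "intervals" mentioned above), which is what actually complicates this PG.

```python
def pg_can_go_active(k: int, m: int, min_size: int, shards_up: int) -> bool:
    """Simplified, hypothetical model of the min_size check for an EC k+m pool.

    Assumption: the PG may serve I/O only when at least `min_size` of its
    k+m shards are available. Peering history is NOT modelled here.
    """
    total = k + m
    if not k <= min_size <= total:
        raise ValueError("min_size must lie between k and k+m")
    if not 0 <= shards_up <= total:
        raise ValueError("shards_up must lie between 0 and k+m")
    return shards_up >= min_size


# The situation described in the thread: EC 4+2, only 4 of 6 OSDs running.
print(pg_can_go_active(k=4, m=2, min_size=5, shards_up=4))  # min_size=5
print(pg_can_go_active(k=4, m=2, min_size=4, shards_up=4))  # min_size=4
```

Under this model, the PG with 4 of 6 shards fails the check at min_size=5 and passes it at min_size=4, which matches the suggestion to set min_size back to 4 until at least 5 shards are recovered.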