Re: [ceph-users] Help with inconsistent pg on EC pool, v9.0.2

2015-08-28 Thread David Zafman
On 8/28/15 4:18 PM, Aaron Ten Clay wrote: How would I go about removing the bad PG with ceph-objectstore-tool? I'm having trouble finding any documentation for said tool. ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --pgid 2.36s1 --op r
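For readers following along, the invocation quoted above can be sketched as a dry run that only prints the command. This is a sketch under assumptions, not the exact command from the thread: the archive cuts the operation off after "r", and the sketch assumes it is the tool's `remove` operation because the question asks about removing the PG. The helper name `build_remove_cmd` and the variables are hypothetical.

```shell
# Dry-run sketch: prints the command, runs nothing against the OSD.
# ASSUMPTION: the truncated "--op r" in the message is "--op remove".
OSD_ID=0                                   # OSD id taken from the example path
PGID=2.36s1                                # EC PG shard id from the example
DATA_PATH="/var/lib/ceph/osd/ceph-${OSD_ID}"

# Hypothetical helper that assembles the ceph-objectstore-tool command line.
build_remove_cmd() {
    printf '%s\n' "ceph-objectstore-tool --data-path ${DATA_PATH} --journal-path ${DATA_PATH}/journal --pgid ${PGID} --op remove"
}

build_remove_cmd
```

The OSD daemon has to be stopped before the tool is pointed at its data path, so printing the command first gives a chance to check the OSD id and PG id before anything is deleted.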

Re: [ceph-users] Help with inconsistent pg on EC pool, v9.0.2

2015-08-28 Thread David Zafman
I don't know about removing the OSD from the CRUSH map. That seems like overkill to me. I just realized a possibly better way: take the OSD down, not out. Remove the EC PG with the bad chunk. Bring it up again and let recovery repair just the single missing PG on the si
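The "down, not out" sequence described above might look roughly like the following dry run, which prints each step instead of executing it. Several things here are assumptions rather than details from the thread: `ceph osd set noout` is one common way to keep a stopped OSD from being marked out, the `systemctl` unit name depends on the init system in use, `--op remove` is assumed as in the question about removing the PG, and the helper `plan_shard_rebuild` and its arguments are hypothetical placeholders.

```shell
# Dry-run sketch: prints the planned steps, runs nothing.
# ASSUMPTIONS: noout flag, systemd unit names, and "--op remove" are not
# confirmed by the thread; adjust for the actual cluster.
plan_shard_rebuild() {
    osd_id="$1"    # OSD holding the bad chunk (placeholder)
    pgid="$2"      # EC PG shard id on that OSD (placeholder)
    data="/var/lib/ceph/osd/ceph-${osd_id}"

    echo "ceph osd set noout"                   # keep the OSD from being marked out
    echo "systemctl stop ceph-osd@${osd_id}"    # OSD goes down, stays in
    echo "ceph-objectstore-tool --data-path ${data} --journal-path ${data}/journal --pgid ${pgid} --op remove"
    echo "systemctl start ceph-osd@${osd_id}"   # recovery rebuilds the missing shard
    echo "ceph osd unset noout"
}

# Placeholder pairing: the thread does not confirm this PG shard lives on osd.21.
plan_shard_rebuild 21 2.36s1
```

Compared with marking the OSD out, this keeps the rest of the OSD's data in place, so only the removed shard needs to be regenerated.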

Re: [ceph-users] Help with inconsistent pg on EC pool, v9.0.2

2015-08-28 Thread Aaron Ten Clay
Thanks for the tip, David. I've marked osd.21 down and out and will wait for recovery. I've never had success manually manipulating the OSD contents - I assume I can achieve the same result by removing osd.21 from the CRUSH map, "ceph osd rm 21", then recreating it from scratch as though I'd lost a

Re: [ceph-users] Help with inconsistent pg on EC pool, v9.0.2

2015-08-28 Thread David Zafman
Without my latest branch which hasn't merged yet, you can't repair an EC pg in the situation that the shard with a bad checksum is in the first k chunks. A way to fix it would be to take that osd down/out and let recovery regenerate the chunk. Remove the pg from the osd (ceph-objectstore-t

Re: [ceph-users] Help with inconsistent pg on EC pool, v9.0.2

2015-08-28 Thread Samuel Just
David, does this look familiar?
-Sam

On Fri, Aug 28, 2015 at 10:43 AM, Aaron Ten Clay wrote:
> Hi Cephers,
>
> I'm trying to resolve an inconsistent pg on an erasure-coded pool, running
> Ceph 9.0.2. I can't seem to get Ceph to run a repair or even deep-scrub the
> pg again. Here's the background

[ceph-users] Help with inconsistent pg on EC pool, v9.0.2

2015-08-28 Thread Aaron Ten Clay
Hi Cephers, I'm trying to resolve an inconsistent pg on an erasure-coded pool, running Ceph 9.0.2. I can't seem to get Ceph to run a repair or even deep-scrub the pg again. Here's the background, with my attempted resolution steps below. Hopefully someone can steer me in the right direction. Thank