Hi Frank

Thanks for the reply.

> I think this happens when a PG has 3 different copies and cannot decide which 
> one is correct. You might have hit a very rare case. You should start with 
> the scrub errors, check which PGs and which copies (OSDs) are affected. It 
> sounds almost like all 3 scrub errors are on the same PG.
Yes, all 3 errors are for the same PG and on the same OSD:
2020-11-01 18:25:09.333339 osd.0 [ERR] 3.b shard 2 soid 3:d577e975:::1000023675e.00000000:head : candidate had a missing snapset key, candidate had a missing info key
2020-11-01 18:25:09.333342 osd.0 [ERR] 3.b soid 3:d577e975:::1000023675e.00000000:head : failed to pick suitable object info
2020-11-01 18:26:33.496255 osd.0 [ERR] 3.b repair 3 errors, 0 fixed
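
A minimal sketch, assuming the cluster log format shown above, of how these lines can be grouped by PG and by the OSD that logged them, to confirm that all errors refer to one PG. The regular expression and the group_scrub_errors helper are only an illustration, not part of Ceph.

```python
import re
from collections import defaultdict

# Scrub error lines copied from the cluster log above, one entry per line.
LOG = """\
2020-11-01 18:25:09.333339 osd.0 [ERR] 3.b shard 2 soid 3:d577e975:::1000023675e.00000000:head : candidate had a missing snapset key, candidate had a missing info key
2020-11-01 18:25:09.333342 osd.0 [ERR] 3.b soid 3:d577e975:::1000023675e.00000000:head : failed to pick suitable object info
2020-11-01 18:26:33.496255 osd.0 [ERR] 3.b repair 3 errors, 0 fixed
"""

# Assumed layout: date, time, logging OSD, [ERR], PG id (pool.hexseed), message.
LINE_RE = re.compile(
    r"^(?P<date>\S+) (?P<time>\S+) (?P<osd>osd\.\d+) \[ERR\] "
    r"(?P<pgid>\d+\.[0-9a-f]+) (?P<msg>.*)$"
)


def group_scrub_errors(text):
    """Group [ERR] messages by (PG id, logging OSD). Hypothetical helper."""
    grouped = defaultdict(list)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            grouped[(match["pgid"], match["osd"])].append(match["msg"])
    return dict(grouped)


errors = group_scrub_errors(LOG)
for (pgid, osd), msgs in errors.items():
    print(f"PG {pgid} (logged by {osd}): {len(msgs)} error line(s)")
```

With the three lines above this yields a single group, PG 3.b logged by osd.0.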

> You might have had a combination of crash and OSD fail, your situation is 
> probably not covered by "single point of failure".
Yes, it was a complex crash; everything went down.

> In case you have a PG with scrub errors on 2 copies, you should be able to 
> reconstruct the PG from the third with PG export/PG import commands.
I have not done a PG export/import before. Could you send the 
instructions or a link to them?

Thanks
Sagara
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
