Re: [ceph-users] Fixing inconsistent placement groups

2014-06-16 Thread Markus Blank-Burian
Okay, now I have copied the bigger PG, including all the object files, to the primary OSD. I still get errors during deep-scrub, for example: 2014-06-16T21:03:25+02:00 kaa-96 ceph-osd: 2014-06-16 21:03:25.954378 7f51e3fff700 0 log [ERR] : 0.7f1 shard 66 missing 9fdff7f1/11fa418.14ed/head//0
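
When there are many such lines, it can help to tally the missing objects per PG and shard before deciding which copy to trust. The following is a minimal sketch, not part of the original messages. It assumes the scrub errors follow the "<pgid> shard <n> missing <object>" pattern quoted in this thread; the function name tally_missing and the regular expression are illustrative assumptions, not anything Ceph provides.

```python
import re
from collections import defaultdict

# Assumed pattern, modelled on the log excerpts quoted in this thread:
#   "0.7f1 shard 66 missing 9fdff7f1/11fa418.14ed/head//0"
MISSING_RE = re.compile(
    r"(?P<pg>\d+\.[0-9a-f]+) shard (?P<shard>\d+) missing (?P<obj>\S+)"
)


def tally_missing(lines):
    """Group missing-object names by (pg id, shard number).

    `lines` is any iterable of log lines; lines that do not match
    the assumed pattern are ignored.
    """
    missing = defaultdict(set)
    for line in lines:
        match = MISSING_RE.search(line)
        if match:
            key = (match.group("pg"), int(match.group("shard")))
            missing[key].add(match.group("obj"))
    return dict(missing)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python tally_missing.py < osd-log-file
    for (pg, shard), objects in sorted(tally_missing(sys.stdin).items()):
        print(f"pg {pg} shard {shard}: {len(objects)} missing object(s)")
```

The script only summarizes what the log reports. It does not inspect the OSD data stores, so deciding which copy is correct still has to be done by hand.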

Re: [ceph-users] Fixing inconsistent placement groups

2014-06-16 Thread Markus Blank-Burian
> Shard...66? Really, that's what it says? Can you copy a few lines of the > output? Just posting the last few lines to give a good impression of the missing-object situation: 96 ceph-osd: 2014-06-16 10:45:20.300894 7f7c73ff7700 0 log [ERR] : 0.7f1 shard 1 missing babdf7f1/1212b52.0165/h

Re: [ceph-users] Fixing inconsistent placement groups

2014-06-16 Thread Aaron Ten Clay
On Mon, Jun 16, 2014 at 11:16 AM, Gregory Farnum wrote: > On Mon, Jun 16, 2014 at 11:11 AM, Aaron Ten Clay > wrote: > > I would also like to see Ceph get smarter about inconsistent PGs. If we > > can't automate the repair, at least the "ceph pg repair" command should > > figure out which copy is

Re: [ceph-users] Fixing inconsistent placement groups

2014-06-16 Thread Gregory Farnum
On Mon, Jun 16, 2014 at 11:11 AM, Aaron Ten Clay wrote: > I would also like to see Ceph get smarter about inconsistent PGs. If we > can't automate the repair, at least the "ceph pg repair" command should > figure out which copy is correct and use that, instead of overwriting all > OSDs with whatev

Re: [ceph-users] Fixing inconsistent placement groups

2014-06-16 Thread Aaron Ten Clay
I would also like to see Ceph get smarter about inconsistent PGs. If we can't automate the repair, at least the "ceph pg repair" command should figure out which copy is correct and use that, instead of overwriting all OSDs with whatever the primary has. Is it impossible to get the expected CRC out

Re: [ceph-users] Fixing inconsistent placement groups

2014-06-16 Thread Gregory Farnum
On Mon, Jun 16, 2014 at 7:13 AM, Markus Blank-Burian wrote: > I am also having inconsistent PGs (running ceph v0.80.1), where some > objects are missing. Excerpt from the logs (many similar lines): > "0.7f1 shard 66 missing a32857f1/1129786./head//0" Shard...66? Really, that's what it

Re: [ceph-users] Fixing inconsistent placement groups

2014-06-16 Thread Markus Blank-Burian
I am also having inconsistent PGs (running ceph v0.80.1), where some objects are missing. Excerpt from the logs (many similar lines): "0.7f1 shard 66 missing a32857f1/1129786./head//0" The primary copy and one replica hold only 453MB of the PG's data, but a third copy exists with 3.1GB of data.

Re: [ceph-users] Fixing inconsistent placement groups

2014-06-12 Thread Gregory Farnum
The OSD should have logged the identities of the inconsistent objects to the central log on the monitors, as well as to its own local log file. You'll need to identify for yourself which version is correct, which will probably involve going and looking at them inside each OSD's data store. If the p

[ceph-users] Fixing inconsistent placement groups

2014-06-12 Thread Aaron Ten Clay
I'm having trouble finding a concise set of steps to repair inconsistent placement groups. I know from other threads that issuing a 'ceph pg repair ...' command could cause loss of data integrity if the primary OSD happens to have the bad copy of the placement group. I know how to find which PGs a