I am also seeing inconsistent PGs (running ceph v0.80.1) where some objects are missing. Excerpt from the logs (many similar lines):

    0.7f1 shard 66 missing a32857f1/10000129786.00000000/head//0
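For reference, this is roughly how I pulled the affected PG and object names out. The log paths are the Ceph defaults and may differ on other setups; osd.1 is the primary here (taken from the assert output further down):

    # which PGs are inconsistent, and which OSDs they map to
    ceph health detail | grep inconsistent
    ceph pg dump | grep inconsistent

    # the per-object scrub errors end up in the cluster log on the monitors
    # and in the primary OSD's local log
    grep 0.7f1 /var/log/ceph/ceph.log | grep missing
    grep 0.7f1 /var/log/ceph/ceph-osd.1.log | grep missing

    # if the messages have already rotated away, a new deep scrub regenerates them
    ceph pg deep-scrub 0.7f1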
The copy on the primary OSD and one replica hold only 453MB of the PG's data, but a third copy exists with 3.1GB. The objects referenced in the log (identified by filename) are present on that third OSD.

First try: move "0.7f1_head" to a backup directory on both the first and the second OSD. This resulted in the same 453MB copy with the missing objects on the primary OSD. Shouldn't all the data have been copied over automatically?

So I tried to copy the whole PG directory "0.7f1_head" from the third OSD to the primary instead. This resulted in the following assert when the OSD started:

2014-06-16T15:49:29+02:00 kaa-96 ceph-osd: -2> 2014-06-16 15:49:29.046925 7f2e86b93780 10 osd.1 197813 pgid 0.7f1 coll 0.7f1_head
2014-06-16T15:49:29+02:00 kaa-96 ceph-osd: -1> 2014-06-16 15:49:29.047033 7f2e86b93780 10 filestore(/local/ceph) collection_getattr /local/ceph/current/0.7f1_head 'info' = -61
2014-06-16T15:49:29+02:00 kaa-96 ceph-osd: 0> 2014-06-16 15:49:29.048966 7f2e86b93780 -1 osd/PG.cc: In function 'static epoch_t PG::peek_map_epoch(ObjectStore*, coll_t, hobject_t&, ceph::bufferlist*)' thread 7f2e86b93780 time 2014-06-16 15:49:29.047045
osd/PG.cc: 2559: FAILED assert(r > 0)
 ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
 1: (PG::peek_map_epoch(ObjectStore*, coll_t, hobject_t&, ceph::buffer::list*)+0x48d) [0x742a8b]
 2: (OSD::load_pgs()+0xda3) [0x64c419]
 3: (OSD::init()+0x780) [0x64e9ce]
 4: (main()+0x25d9) [0x602cbf]

Am I missing something? And wouldn't it be relatively easy to add an option to "ceph pg repair" that chooses one of the replica OSDs as the source instead of the primary? (A guess at why the copy blew up, and how I am comparing the three copies, is in the PS below the quoted thread.)

It is still unclear where these inconsistencies (i.e. missing objects / empty directories) come from; see also http://tracker.ceph.com/issues/8532.

On Fri, Jun 13, 2014 at 4:58 AM, Gregory Farnum <g...@inktank.com> wrote:
> The OSD should have logged the identities of the inconsistent objects
> to the central log on the monitors, as well as to its own local log
> file. You'll need to identify for yourself which version is correct,
> which will probably involve going and looking at them inside each
> OSD's data store. If the primary is correct for all the objects in a
> PG, you can just run repair; otherwise you'll want to copy the
> replica's copy to the primary. Sorry. :/
> (If you have no way of checking yourself which is correct, and you
> have more than 2 replicas, you can compare the stored copies and just
> take the one held by the majority — that's probably correct.)
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
>
> On Thu, Jun 12, 2014 at 7:27 PM, Aaron Ten Clay <aaro...@aarontc.com> wrote:
>> I'm having trouble finding a concise set of steps to repair inconsistent
>> placement groups. I know from other threads that issuing a 'ceph pg repair
>> ...' command could cause loss of data integrity if the primary OSD happens
>> to have the bad copy of the placement group. I know how to find which PGs
>> are bad (ceph pg dump), but I'm not sure how to figure out which objects in
>> the PG failed their CRCs during the deep scrub, and I'm not sure how to get
>> the correct CRC so I can determine which OSD holds the correct copy.
>>
>> Maybe I'm on the wrong path entirely? If someone knows how to resolve this,
>> I'd appreciate some insight. I think this would be a good topic for adding
>> to the OSD/PG operations section of the manual, or at least a wiki article.
>>
>> Thanks!
>> -Aaron
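PS: a guess (untested, so take it as an assumption) at why the hand-copied PG directory trips the assert above: collection_getattr returning -61 (ENODATA) looks like a missing extended attribute on the 0.7f1_head collection directory, and a plain recursive copy does not preserve xattrs. With the OSD stopped, something like the following would show whether the PG metadata attrs survived, and redo the copy with xattrs intact ("third-osd-host" is a placeholder for the host holding the 3.1GB copy):

    # dump all xattrs on the collection directory; compare the healthy third
    # OSD against the freshly copied primary, the attrs should match
    getfattr -d -m - /local/ceph/current/0.7f1_head

    # re-copy from the third OSD, this time preserving xattrs
    # (-a = archive mode, -X = keep extended attributes)
    rsync -aX third-osd-host:/local/ceph/current/0.7f1_head/ \
          /local/ceph/current/0.7f1_head/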
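PPS: for Greg's suggestion of comparing the stored copies and taking the majority, this is roughly what I am using. It assumes the same /local/ceph filestore layout on every host; "host-b" and "host-c" are placeholders for the other two OSD hosts, and the basename trick is there because the hashed DIR_* layout below the PG directory can differ between OSDs:

    # build a "checksum object-name" list of the PG's contents on each host
    for h in kaa-96 host-b host-c; do
        ssh "$h" "cd /local/ceph/current/0.7f1_head && find . -type f -exec md5sum {} +" \
            | awk '{ n = split($2, p, "/"); print $1, p[n] }' | sort -k2 > "/tmp/$h.sums"
    done

    # copies that agree diff clean; the odd one out is the broken copy
    diff /tmp/kaa-96.sums /tmp/host-b.sums
    diff /tmp/kaa-96.sums /tmp/host-c.sums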