You are correct: even though the repair reported an error, I was able to bring 
the disk back into the cluster, and it stopped reporting the legacy omap 
warning. I had assumed an "error" was something that needed to be rectified 
before anything could proceed, but apparently it's more like "warning: there 
was an error on this one non-critical task" :)


We'll probably just destroy and rebuild that OSD once we're back to HEALTH_OK.


Thank you!


________________________________
From: Igor Fedotov <ifedo...@suse.de>
Sent: Thursday, May 20, 2021 05:15
To: Pickett, Neale T; ceph-users@ceph.io
Subject: [EXTERNAL] [ceph-users] Re: fsck error: found stray omap data on 
omap_head

I think there is no way to fix that at the moment other than manually
identifying and removing the relevant record(s) in RocksDB with
ceph-kvstore-tool, which might be pretty tricky...
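
For reference, a very rough sketch of that manual approach, with the OSD
stopped (the "M" prefix and the idea that the key encodes the reported
omap_head 12256434 are guesses on my part -- please verify the key layout
for your BlueStore version before removing anything):

# dump the omap keyspace and look for entries belonging to omap_head 12256434
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-9 list M > omap-keys.txt
# then remove each matching key individually, using the key from the listing
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-9 rm M <key>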

Looks like we should implement removal of these stray records when
repairing BlueStore...


On 5/19/2021 11:12 PM, Pickett, Neale T wrote:
> We just upgraded to pacific, and I'm trying to clear warnings about legacy 
> bluestore omap usage stats by running 'ceph-bluestore-tool repair', as 
> instructed by the warning message. It's been going fine, but we are now 
> getting this error:
>
>
> [root@vanilla bin]# ceph-bluestore-tool repair --path $osd_path
> 2021-05-19T19:25:26.485+0000 7f67ca3593c0 -1 
> bluestore(/var/lib/ceph/osd/ceph-9) fsck error: found stray omap data on 
> omap_head 12256434 0 0
> repair status: remaining 1 error(s) and warning(s)
> [root@vanilla bin]# ceph-bluestore-tool fsck --path $osd_path -deep
>
> 2021-05-19T20:03:17.002+0000 7f4d1d6603c0 -1 
> bluestore(/var/lib/ceph/osd/ceph-9) fsck error: found stray omap data on 
> omap_head 12256434 0 0
>
> fsck status: remaining 1 error(s) and warning(s)
>
>
> We're only 10% of the way through our OSDs, so I'd like to find some way to 
> fix this other than destroying and rebuilding the OSD, in case it happens 
> again. Fixing this error is especially attractive since we can't get out of 
> HEALTH_WARN until we've run the repair on all OSDs.

One can silence the 'legacy omap' warning via the
"bluestore_warn_on_no_per_pool_omap" and
"bluestore_warn_on_no_per_pg_omap" config parameters.

And I'm not sure I understand why the above fsck error prevents you from
proceeding with the upgrade. Indeed, the repair leaves this stray omap
record as-is, but all the other omaps should be properly converted at
this point. I presume this should eliminate the "legacy omap" warning
for this specific OSD. Isn't that the case?
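
A quick way to see whether that OSD is still being flagged once it's back in:

ceph health detail | grep -i omap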


>
> Any suggestions?
>
>
>
> Neale Pickett <ne...@lanl.gov>
> A-4: Advanced Research in Cyber Systems
> Los Alamos National Laboratory
> _______________________________________________
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
