You are correct, even though the repair reports an error, I was able to join the disk back into the cluster, and it stopped reporting the legacy omap warning. I had assumed an "error" was something that needed to be rectified before anything could proceed, but apparently it's more like "warning: there was an error on this one non-critical task" :)
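In case anyone else hits this: we just checked from the mon side that the warning had cleared, and if it hadn't we could have muted it with the config parameters Igor mentions below. Roughly like this -- treat the exact invocations as a sketch rather than something we ran verbatim:

    # check whether the legacy omap warning is still raised
    ceph health detail | grep -i OMAP

    # mute it cluster-wide if needed (per Igor's note below)
    ceph config set osd bluestore_warn_on_no_per_pool_omap false
    ceph config set osd bluestore_warn_on_no_per_pg_omap false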
We'll probably just destroy and rebuild that OSD once we're back to HEALTH_OK. Thank you!

________________________________
From: Igor Fedotov <ifedo...@suse.de>
Sent: Thursday, May 20, 2021 05:15
To: Pickett, Neale T; ceph-users@ceph.io
Subject: [EXTERNAL] [ceph-users] Re: fsck error: found stray omap data on omap_head

I think there is no way to fix that at the moment other than to manually identify and remove the relevant record(s) in RocksDB with ceph-kvstore-tool, which might be pretty tricky... Looks like we should implement removal of these stray records when repairing BlueStore...

On 5/19/2021 11:12 PM, Pickett, Neale T wrote:
> We just upgraded to Pacific, and I'm trying to clear warnings about legacy
> bluestore omap usage stats by running 'ceph-bluestore-tool repair', as
> instructed by the warning message. It's been going fine, but we are now
> getting this error:
>
> [root@vanilla bin]# ceph-bluestore-tool repair --path $osd_path
> 2021-05-19T19:25:26.485+0000 7f67ca3593c0 -1 bluestore(/var/lib/ceph/osd/ceph-9) fsck error: found stray omap data on omap_head 12256434 0 0
> repair status: remaining 1 error(s) and warning(s)
>
> [root@vanilla bin]# ceph-bluestore-tool fsck --path $osd_path --deep
> 2021-05-19T20:03:17.002+0000 7f4d1d6603c0 -1 bluestore(/var/lib/ceph/osd/ceph-9) fsck error: found stray omap data on omap_head 12256434 0 0
> fsck status: remaining 1 error(s) and warning(s)
>
> We're only 10% of the way through our OSDs, so I'd like to find some way to
> fix this other than destroying and rebuilding the OSD, in case it happens
> again. Fixing this error is especially attractive since we can't get out of
> HEALTH_WARN until we've run repair on all OSDs.

One can silence the 'legacy omap' warning via the "bluestore_warn_on_no_per_pool_omap" and "bluestore_warn_on_no_per_pg_omap" config parameters.

And I'm not sure I understand why the above fsck error prevents you from proceeding with the upgrade. Indeed, the repair leaves this stray omap record as-is, but all the other omaps should be properly converted at this point. I presume this should eliminate the "legacy omap" warning for this specific OSD. Isn't that the case?

> Any suggestions?
>
> Neale Pickett <ne...@lanl.gov>
> A-4: Advanced Research in Cyber Systems
> Los Alamos National Laboratory
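For the record, if someone does want to attempt the manual cleanup Igor describes above rather than rebuilding the OSD, I'd expect it to look roughly like the following, run with the OSD stopped. The prefix and key arguments are placeholders I have not verified; identifying the exact RocksDB records belonging to omap_head 12256434 is the tricky part Igor alludes to.

    # dump the keys so the stray omap_head's records can be located
    ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-9 list > keys.txt

    # once the offending prefix and key are known, remove them one by one
    ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-9 rm <prefix> <key>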