Thanks to everyone who has replied so far.
During my debugging, I discovered that either an 'object-map rebuild' or
an 'object-map check' is sufficient to clear the conditions that are
preventing the volume from being used properly.
Is there something in common that the two commands both do that could be
affecting the volume, i.e. clearing locks, resetting state, flushing a cache, etc.?
I checked for locks, watchers, etc. when the volume was not usable, but
found nothing evident.
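
For comparing the image state before and after an object-map check, the read-only queries can be collected in one place. This is only a rough sketch, not a tested procedure: it assumes the `rbd` CLI is on the PATH with a working keyring, the function name `snapshot_image_state` and the image spec in the usage note are made up for illustration, and the `run` parameter exists only so the logic can be exercised without a cluster.

```python
import subprocess


def snapshot_image_state(image_spec, run=None):
    """Return {command: output} for read-only rbd queries on image_spec."""
    if run is None:
        # Default runner shells out to the real rbd CLI.
        def run(cmd):
            done = subprocess.run(cmd, capture_output=True, text=True)
            return done.stdout + done.stderr

    queries = [
        ["rbd", "info", image_spec],          # features and flags
        ["rbd", "status", image_spec],        # watchers
        ["rbd", "lock", "list", image_spec],  # advisory locks
    ]
    # Keep the outputs keyed by the command that produced them.
    return {" ".join(cmd): run(cmd) for cmd in queries}
```

Calling `snapshot_image_state("vms/disk-1")` (a hypothetical image spec) once while the volume is unusable and again after the check would give two dictionaries that can be compared side by side.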
Cheers,
Gary
On 2025-06-26 9:33 a.m., Ilya Dryomov wrote:
On Tue, Jun 24, 2025 at 11:19 PM Gary Molenkamp <molen...@uwo.ca> wrote:
We use ceph rbd as a volume service for both an Openstack deployment and
a series of Proxmox servers. This ceph deployment started as a Hammer
release and has been upgraded over the years to where it is now running
Quincy. It has been fairly solid over that time, even
through upgrades from filestore to bluestore, and many transparent
hardware replacements/improvements.
One concern we have is that when we have a hypervisor that unexpectedly
dies/crashes, the volumes must always have the object maps rebuilt. If
we don't rebuild the object maps, the VMs will either not boot, or we
will have other side-effects that render the volume unusable (e.g. cannot
mount root). Is this to be expected during this type of event, or have
I missed a setting during one of the many upgrades on our deployment?
Hi Gary,
It's definitely not expected. Have you ever run the "rbd object-map check"
command and captured its output before rebuilding the object map? Some
object map inconsistencies following a hard crash are expected, but they
shouldn't be leading to the VM not booting/rootfs not mounting.
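
One way to make sure the check output is saved before the rebuild is to script the two steps together. The following is a minimal sketch under stated assumptions: the `rbd` CLI is installed and can reach the cluster, and the function name `capture_then_rebuild`, the image spec, and the log path are hypothetical. By default it only reports the commands it would run.

```python
import shlex
import subprocess
from pathlib import Path


def capture_then_rebuild(image_spec, log_path, dry_run=True):
    """Save `rbd object-map check` output to log_path, then rebuild the map."""
    check_cmd = ["rbd", "object-map", "check", image_spec]
    rebuild_cmd = ["rbd", "object-map", "rebuild", image_spec]
    planned = [shlex.join(check_cmd), shlex.join(rebuild_cmd)]
    if dry_run:
        # Only report what would be run; nothing touches the cluster.
        return planned
    # Capture stdout and stderr of the check before anything is rebuilt.
    result = subprocess.run(check_cmd, capture_output=True, text=True)
    Path(log_path).write_text(result.stdout + result.stderr)
    # Rebuild only after the check output is safely on disk.
    subprocess.run(rebuild_cmd, check=True)
    return planned
```

For example, `capture_then_rebuild("volumes/volume-1234", "/tmp/objmap-check.log", dry_run=False)` would write the check output to the log file and then rebuild; both the image spec and the path here are placeholders.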
Thanks,
Ilya
--
Gary Molenkamp Science Technology Services
Systems Engineer University of Western Ontario
molen...@uwo.ca http://sts.sci.uwo.ca
(519) 661-2111 x86882 (519) 661-3566