Re: [Qemu-devel] [PATCH] blockdev: reset werror/rerror on drive_del

Stefan Hajnoczi Wed, 05 Jun 2013 01:23:33 -0700

On Tue, Jun 04, 2013 at 06:37:27PM +0200, Markus Armbruster wrote:
> Stefan Hajnoczi <stefa...@redhat.com> writes:
> 
> > Paolo Bonzini <pbonz...@redhat.com> suggested the following test case:
> >
> > 1. Launch a guest and wait at the GRUB boot menu:
> >
> >   qemu-system-x86_64 -enable-kvm -m 1024 \
> >    -drive if=none,cache=none,file=test.img,id=foo,werror=stop,rerror=stop \
> >    -device virtio-blk-pci,drive=foo,id=virtio0,addr=4
> >
> > 2. Hot unplug the device:
> >
> >   (qemu) drive_del foo
> >
> > 3. Select the first boot menu entry
> >
> > Without this patch the guest pauses due to ENOMEDIUM.  But it is not
> > possible to resolve this situation - the drive has become anonymous.
> >
> > With this patch the guest gets the ENOMEDIUM error.
> >
> > Note that this scenario actually happens sometimes during libvirt disk
> > hot unplug, where device_del is followed by drive_del.  I/O may still be
> > submitted to the drive after drive_del if the guest does not process the
> > PCI hot unplug notification.
> >
> > Reported-by: Dafna Ron <d...@redhat.com>
> > Signed-off-by: Stefan Hajnoczi <stefa...@redhat.com>
> > ---
> >  blockdev.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/blockdev.c b/blockdev.c
> > index d1ec99a..6eb81a3 100644
> > --- a/blockdev.c
> > +++ b/blockdev.c
> > @@ -1180,6 +1180,10 @@ int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
> >       */
> >      if (bdrv_get_attached_dev(bs)) {
> >          bdrv_make_anon(bs);
> > +
> > +        /* Further I/O must not pause the guest */
> > +        bdrv_set_on_error(bs, BLOCKDEV_ON_ERROR_REPORT,
> > +                          BLOCKDEV_ON_ERROR_REPORT);
> >      } else {
> >          drive_uninit(drive_get_by_blockdev(bs));
> >      }
> 
> The user gets exactly what he ordered.  He ordered "stop on error", then
> provoked errors by turning the virtual block device into a virtual pile
> of scrap metal.  Because that's exactly what drive_del does when used
> while a device model is attached to the drive.
> 
> The only sane use case for drive_del I can think of is revoking access
> to an image violently, after the guest failed to honor a hot unplug.
> 
> Even then, using drive_del when the block device is removable is
> unnecessary.  Just rip out the medium with eject -f.  Look ma, no scrap
> metal.
> 
> I'm not sure what you mean by "it is not possible to resolve this
> situation".  The device is shot!  Can't see how that could be resolved.


This is the critical part: the guest is paused and there is no way to
resolve the continuous pause loop.  The drive is gone but the guest
hasn't PCI hot unplugged the storage controller.  As a user, there's
nothing you can do on the QEMU monitor to resume the guest - it will
just pause itself again.

This behavior is really bad: QEMU has basically wedged the guest into an
unrecoverable state, and that's what I was trying to describe.
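
To make the pause loop concrete, here is a toy model in C.  It is not
QEMU code: every name in it (toy_drive, toy_submit, and so on) is made
up for illustration, and it assumes that a request which failed under
the "stop" policy is simply retried each time the user resumes the
guest.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in QEMU. */
enum toy_on_error { TOY_ON_ERROR_REPORT, TOY_ON_ERROR_STOP };

struct toy_drive {
    bool has_medium;             /* false once the drive has been deleted */
    enum toy_on_error on_error;  /* stands in for werror/rerror */
};

struct toy_result {
    int pauses;            /* how many times the guest was paused */
    bool guest_saw_error;  /* true if the error was reported to the guest */
    bool completed;        /* true if the request succeeded */
};

/*
 * Submit one request.  max_resumes is how many times the user is
 * willing to resume the guest before giving up.
 */
static struct toy_result toy_submit(const struct toy_drive *d, int max_resumes)
{
    struct toy_result r = { 0, false, false };

    assert(max_resumes >= 0);
    for (;;) {
        if (d->has_medium) {
            r.completed = true;
            return r;
        }
        if (d->on_error == TOY_ON_ERROR_REPORT) {
            /* The error goes to the guest, which keeps running. */
            r.guest_saw_error = true;
            return r;
        }
        /* "stop" policy: pause the guest, retry the request on resume. */
        r.pauses++;
        if (r.pauses > max_resumes) {
            return r;  /* user gave up, guest is still paused */
        }
    }
}
```

With has_medium false and the "stop" policy, every resume ends in
another pause and the guest never sees an error.  With the "report"
policy the same request fails once and the guest carries on.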

> I figure the bit that can't be resolved now is letting the user switch
> off "stop on error" safely before a drive_del.  Even if we had a command
> for that, there'd still be a window between that command's execution and
> drive_del's.  Your patch solves the problem by having drive_del switch
> it off unconditionally.  Oookay, but please document it, because it's
> not exactly obvious.

Thanks for the documentation suggestion, will add it in v2.

> Re "the guest gets the ENOMEDIUM error": depends on the device.  I doubt
> disks can signal "no medium", and even if they could, I doubt device
> drivers are prepared for it.

Yep, error reporting depends on the emulated storage controller.
virtio-blk and IDE just report a generic error status.
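
As a rough illustration of the virtio-blk case, the sketch below
collapses any negative errno into one generic status byte.  The helper
and the TOY_ constants are hypothetical and are not taken from QEMU's
virtio-blk code; the two values are meant to mirror the OK and IOERR
status codes in the virtio specification.

```c
#include <errno.h>
#include <stdint.h>

#ifndef ENOMEDIUM
#define ENOMEDIUM 123  /* Linux value, fallback for other platforms */
#endif

/* Hypothetical stand-ins for the virtio-blk status codes. */
#define TOY_VIRTIO_BLK_S_OK    0
#define TOY_VIRTIO_BLK_S_IOERR 1

/* Map a block layer return value (0 or -errno) to a status byte. */
static uint8_t toy_virtio_blk_status(int ret)
{
    return ret < 0 ? TOY_VIRTIO_BLK_S_IOERR : TOY_VIRTIO_BLK_S_OK;
}
```

In this model -ENOMEDIUM and -EIO are indistinguishable to the guest
driver, which only learns that the request failed.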

> Re "this scenario actually happens sometimes during libvirt disk hot
> unplug, where device_del is followed by drive_del": if I remember
> correctly, libvirt disk hot unplug runs drive_del right after
> device_del, opening a window where the guest sees a dead device.  That's
> asking for trouble, and trouble is known to oblige.

Agreed.
