John Snow <js...@redhat.com> writes: > On 09/02/2016 01:44 AM, Markus Armbruster wrote: >> John Snow <js...@redhat.com> writes: >> >>> If a device still has an attached BDS because the medium has not yet >>> been removed, we will be unable to migrate to a new host because >>> blk_flush will return an error for that backend. >>> >>> Replace the call to blk_is_available to blk_is_inserted to weaken >>> the check and allow flushes from the backend to work, while still >>> disallowing flushes from the frontend/device model to work. >>> >>> This fixes a regression present in 2.6.0 caused by the following commit: >>> fe1a9cbc339bb54d20f1ca4c1e8788d16944d5cf >>> block: Move some bdrv_*_all() functions to BB >>> >>> Signed-off-by: John Snow <js...@redhat.com> >>> --- >>> block/block-backend.c | 6 +++--- >>> 1 file changed, 3 insertions(+), 3 deletions(-) >>> >>> diff --git a/block/block-backend.c b/block/block-backend.c >>> index effa038..36a32c3 100644 >>> --- a/block/block-backend.c >>> +++ b/block/block-backend.c >>> @@ -1058,7 +1058,7 @@ BlockAIOCB *blk_aio_pwritev(BlockBackend *blk, >>> int64_t offset, >>> BlockAIOCB *blk_aio_flush(BlockBackend *blk, >>> BlockCompletionFunc *cb, void *opaque) >>> { >>> - if (!blk_is_available(blk)) { >>> + if (!blk_is_inserted(blk)) { >>> return blk_abort_aio_request(blk, cb, opaque, -ENOMEDIUM); >>> } >>> >>> @@ -1118,7 +1118,7 @@ int blk_co_pdiscard(BlockBackend *blk, int64_t >>> offset, int count) >>> >>> int blk_co_flush(BlockBackend *blk) >>> { >>> - if (!blk_is_available(blk)) { >>> + if (!blk_is_inserted(blk)) { >>> return -ENOMEDIUM; >>> } >>> >>> @@ -1127,7 +1127,7 @@ int blk_co_flush(BlockBackend *blk) >>> >>> int blk_flush(BlockBackend *blk) >>> { >>> - if (!blk_is_available(blk)) { >>> + if (!blk_is_inserted(blk)) { >>> return -ENOMEDIUM; >>> } >> >> Naive question: should we flush before we open the tray? >> > > Naive, but long-winded answer: > > There are two types of flushes to me, conceptually: > > Frontend and Backend. > > Frontend would be a request from the quest to flush. If the medium in > question is absent, this should rightly fail. I do expect this to be > handled at the device layer. > > Backend is a request from QEMU for various reasons, like needing to > drain the queue for migration or savevm. > > What's happening here is that we have a backend request to flush a > medium that is inaccessible to the guest
Assuming we caught frontend requests at the device layer already. > -- The flush all routine is > ignorant of this fact. So we get a migration request with an open tray > (because the user had opened it some time prior perhaps, and left it > open unwittingly) and it fails because its inaccessible to the > guest. Migration fails as a result. > > That seems wrong to me. A few ways to fix it: > > (1) Have internal flush requests be aware of the fact that there's > nothing to flush here: this is a read-only media and we could skip it. Returning successfully because there's nothing to flush makes sense to me. > (2) Allow flush to fail in a non-fatal way for cases where we need to > flush, but cannot (-ENOMEDIUM) and where it would rightly fail on a > real machine. I don't like -ENOMEDIUM when there's nothing to flush. > (3) Just allow flushes to work on devices not visible to the guest, > which is what I've done here. Internal requests should work, Guest > requests should fail. I guess that's okay, ... > I was briefly concerned about "What if this lets us flush something we > shouldn't?" and Max's take was "That doesn't seem so bad. CDROMs are > RO anyway." ... even though I don't buy Max's reason. Writable removable media exist. The argument I can buy is that internal requests are fundamentally different from external requests. > So I went with the easiest option here. > > To answer your question more directly, we aren't choosing to > eject-then-flush, the user is. I can't move the flush before the eject > unless we flush-on-eject internally, then mark the device explicitly > as not needing to be flushed anymore, but that's essentially (1) up > there. > > --js