On 2015-02-18 at 07:39, Kevin Wolf wrote:
Am 17.02.2015 um 22:33 hat Max Reitz geschrieben:
Concurrently modifying the bmap is not a good idea;
Why? I mean, the fact that this fixes something for you probably means
that there really is some piece of local state that is invalidated by
concurrent writes, but it's not obvious to me what it is.
One thing I can see is the following: The write operations for header
and bmap are separate. Therefore, writing the header may succeed; then,
when writing the bmap, something may yield and another instance of
vdi_co_write() gets to run. That instance is still in the while loop,
allocating a new block.
It modifies s->bmap[] and increases s->header.blocks_allocated; then
tries to write. That yields and the first coroutine wakes up. It writes
the bmap to the disk, which is now inconsistent with
s->header.blocks_allocated, though. So that looks to me like it may
break, although it in fact doesn't seem to be responsible for the
problem at hand.
For that problem, always doing the unlock() after the bdrv_write() in
the "if (!VDI_IS_ALLOCATED(bmap_entry))" path seems enough; although I
cannot really explain how coroutines yielding from writing the data
block interferes with other coroutines allocating new blocks.
However, in case you agree with my reasoning for the possible corruption
(which I did not try to produce), this patch (which would fix that)
would also happen to prevent the really appearing VDI problems, so...
Max
What could obviously happen is that metadata is written before the data
is on the disk, but as we don't support backing files for VDI, this is
irrelvant. (And if it were relevant, it stil wouldn't be fixed by your
patch because the driver never flushes.)
this patch adds a
lock for it. See https://bugs.launchpad.net/qemu/+bug/1422307 for what
can go wrong without.
Signed-off-by: Max Reitz <mre...@redhat.com>
Kevin