On Tue, 04/26 01:55, Laszlo Ersek wrote:
> On 04/15/16 05:27, Fam Zheng wrote:
> > Block drivers can implement this new operation .bdrv_lockf to actually lock
> > the image in the protocol specific way.
> >
> > Signed-off-by: Fam Zheng <f...@redhat.com>
> > ---
> >  block.c                   | 42 ++++++++++++++++++++++++++++++++++++++++++
> >  include/block/block_int.h | 12 ++++++++++++
> >  2 files changed, 54 insertions(+)
> >
> > diff --git a/block.c b/block.c
> > index 1c575e4..7971a25 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -846,6 +846,34 @@ out:
> >      g_free(gen_node_name);
> >  }
> >
> > +static int bdrv_lock_unlock_image_do(BlockDriverState *bs, bool lock_image)
> > +{
> > +    int cmd = BDRV_LOCKF_UNLOCK;
> > +
> > +    if (bs->image_locked == lock_image) {
> > +        return 0;
> > +    } else if (!bs->drv) {
> > +        return -ENOMEDIUM;
> > +    } else if (!bs->drv->bdrv_lockf) {
> > +        return 0;
> > +    }
> > +    if (lock_image) {
> > +        cmd = bs->open_flags & BDRV_O_RDWR ? BDRV_LOCKF_RWLOCK :
> > +                                             BDRV_LOCKF_ROLOCK;
> > +    }
> > +    return bs->drv->bdrv_lockf(bs, cmd);
> > +}
> > +
> > +static int bdrv_lock_image(BlockDriverState *bs)
> > +{
> > +    return bdrv_lock_unlock_image_do(bs, true);
> > +}
> > +
> > +static int bdrv_unlock_image(BlockDriverState *bs)
> > +{
> > +    return bdrv_lock_unlock_image_do(bs, false);
> > +}
> > +
> >  static QemuOptsList bdrv_runtime_opts = {
> >      .name = "bdrv_common",
> >      .head = QTAILQ_HEAD_INITIALIZER(bdrv_runtime_opts.head),
> > @@ -995,6 +1023,14 @@ static int bdrv_open_common(BlockDriverState *bs, BdrvChild *file,
> >          goto free_and_fail;
> >      }
> >
> > +    if (!(open_flags & (BDRV_O_NO_LOCK | BDRV_O_INACTIVE))) {
> > +        ret = bdrv_lock_image(bs);
> > +        if (ret) {
> > +            error_setg(errp, "Failed to lock image");
> > +            goto free_and_fail;
> > +        }
> > +    }
> > +
> >      ret = refresh_total_sectors(bs, bs->total_sectors);
> >      if (ret < 0) {
> >          error_setg_errno(errp, -ret, "Could not refresh total sector count");
> > @@ -2144,6 +2180,7 @@ static void bdrv_close(BlockDriverState *bs)
> >      if (bs->drv) {
> >          BdrvChild *child, *next;
> >
> > +        bdrv_unlock_image(bs);
> >          bs->drv->bdrv_close(bs);
> >          bs->drv = NULL;
> >
> > @@ -3230,6 +3267,9 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
> >          error_setg_errno(errp, -ret, "Could not refresh total sector count");
> >          return;
> >      }
> > +    if (!(bs->open_flags & BDRV_O_NO_LOCK)) {
> > +        bdrv_lock_image(bs);
> > +    }
> >  }
> >
> >  void bdrv_invalidate_cache_all(Error **errp)
> > @@ -3262,6 +3302,7 @@ static int bdrv_inactivate(BlockDriverState *bs)
> >      }
> >
> >      bs->open_flags |= BDRV_O_INACTIVE;
> > +    ret = bdrv_unlock_image(bs);
> >      return 0;
> >  }
> >
> > @@ -3981,3 +4022,4 @@ void bdrv_refresh_filename(BlockDriverState *bs)
> >          QDECREF(json);
> >      }
> >  }
> > +
> > diff --git a/include/block/block_int.h b/include/block/block_int.h
> > index 10d8759..ffa30b0 100644
> > --- a/include/block/block_int.h
> > +++ b/include/block/block_int.h
> > @@ -85,6 +85,12 @@ typedef struct BdrvTrackedRequest {
> >      struct BdrvTrackedRequest *waiting_for;
> >  } BdrvTrackedRequest;
> >
> > +typedef enum {
> > +    BDRV_LOCKF_RWLOCK,
> > +    BDRV_LOCKF_ROLOCK,
> > +    BDRV_LOCKF_UNLOCK,
> > +} BdrvLockfCmd;
> > +
> >  struct BlockDriver {
> >      const char *format_name;
> >      int instance_size;
> > @@ -317,6 +323,11 @@ struct BlockDriver {
> >       */
> >      void (*bdrv_drain)(BlockDriverState *bs);
> >
> > +    /**
> > +     * Lock/unlock the image.
> > +     */
> > +    int (*bdrv_lockf)(BlockDriverState *bs, BdrvLockfCmd cmd);
> > +
> >      QLIST_ENTRY(BlockDriver) list;
> >  };
> >
> > @@ -485,6 +496,7 @@ struct BlockDriverState {
> >      NotifierWithReturn write_threshold_notifier;
> >
> >      int quiesce_counter;
> > +    bool image_locked;
> >  };
> >
> >  struct BlockBackendRootState {
> >
>
> I'd like to raise one point which I think may not have been, yet (after
> briefly skimming the v1 / v2 comments). Sorry if this has been discussed
> already.
>
> IIUC, the idea is that "protocols" (in the block layer sense) implement
> the lockf method, and then bdrv_open_common() automatically locks image
> files, if the lockf method is available, and if various settings
> (cmdline options etc) don't request otherwise.
>
> I tried to see if this series modifies -- for example --
> raw_reopen_commit() and raw_reopen_abort(), in "block/raw-posix.c". Or,
> if it modifies bdrv_reopen_multiple(), in "block.c". It doesn't seem to.
>
> Those functions are relevant for the following reason. Given the
> following chain of references:
>
>   file descriptor --> file description --> file
>
> an fcntl() lock is associated with the file. However, the fcntl() lock
> held by the process on the file is dropped if the process closes *any*
> file descriptor that points (through the same or another file
> description) to the file. From
> <http://pubs.opengroup.org/onlinepubs/9699919799/functions/fcntl.html>:
>
>     All locks associated with a file for a given process shall be
>     removed when a file descriptor for that file is closed by that
>     process [...]
>
> From <http://pubs.opengroup.org/onlinepubs/9699919799/functions/close.html>:
>
>     All outstanding record locks owned by the process on the file
>     associated with the file descriptor shall be removed (that is,
>     unlocked).
>
> From <http://man7.org/linux/man-pages/man2/fcntl.2.html>:
>
>     If a process closes any file descriptor referring to a file, then
>     all of the process's locks on that file are released, regardless of
>     the file descriptor(s) on which the locks were obtained.
>
> The bdrv_reopen_multiple() function reopens a bunch of image files.
> Backed by the raw-posix protocol driver, this seems to boil down to a
> series of (i) fcntl(F_DUPFD_CLOEXEC), and/or (ii) dup(), and/or (iii)
> qemu_open() calls, in raw_reopen_prepare(). The result is stored in
> "raw_s->fd" every time.
>
> (In the first two cases, the file description will be shared, in the
> third case, the file will be shared, between "s->fd" and "raw_s->fd".)
>
> Assume that one of the raw_reopen_prepare() calls fails. Then
> bdrv_reopen_multiple() will roll back the work done thus far, calling
> raw_reopen_abort() on the initial subset of image files. This results in
> "raw_s->fd" being passed to close(), which is when the lock
> (conceptually held for "s->fd") is dropped for good.
>
> If all of the raw_reopen_prepare() calls succeed, then a series of
> raw_reopen_commit() calls will occur. That has the same effect: "s->fd"
> is passed to close(), which drops the lock for "raw_s->fd" too (which is
> supposed to be used for accessing the file, going forward).
>
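
The close() behaviour described above can be reproduced outside QEMU. The following is a minimal standalone sketch, not QEMU code: the helper names locked_by_other_process() and demo_lock_lost_on_close() are made up for illustration, and it assumes a POSIX system and a local filesystem. It takes an fcntl() write lock through one descriptor, opens and closes a second descriptor for the same file (as in case (iii) above), and lets a forked child check with F_GETLK whether the lock is still there.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child and let it ask, via F_GETLK, whether a
 * write lock on the whole file would conflict with a lock held by another
 * process (here: the parent). Returns 1 if locked, 0 if not, -1 on error.
 * A child is needed because F_GETLK never reports the caller's own locks.
 */
static int locked_by_other_process(const char *path)
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* l_start = 0 and l_len = 0 cover the whole file. */
        struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
        int fd = open(path, O_RDWR);

        if (fd < 0 || fcntl(fd, F_GETLK, &fl) < 0) {
            _exit(2);
        }
        _exit(fl.l_type == F_UNLCK ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) > 1) {
        return -1;
    }
    return WEXITSTATUS(status);
}

/*
 * Hypothetical demo: lock the file through "fd", then open and close an
 * unrelated descriptor "fd2" for the same file. *before and *after receive
 * the lock state seen by another process around the close(fd2) call.
 * Returns 0 on success, -1 on error.
 */
static int demo_lock_lost_on_close(const char *path, int *before, int *after)
{
    struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
    int fd, fd2;

    assert(path != NULL && before != NULL && after != NULL);

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) {
        return -1;
    }
    if (fcntl(fd, F_SETLK, &fl) < 0) {
        close(fd);
        return -1;
    }
    *before = locked_by_other_process(path);

    /* Second open(): a new file description for the same file. */
    fd2 = open(path, O_RDONLY);
    if (fd2 < 0) {
        close(fd);
        return -1;
    }
    close(fd2); /* this releases the lock that was taken through fd */
    *after = locked_by_other_process(path);

    close(fd);
    return 0;
}
```

Calling demo_lock_lost_on_close() on a scratch file should report the lock as held before the close(fd2) and gone afterwards, even though "fd" itself stays open the whole time. That is the same situation raw_reopen_commit() and raw_reopen_abort() would create for a lock taken at open time.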

Yes, this is a good catch! I'll take care of it in the next version, and add
some tests.

Thanks!

Fam