Re: [PATCH v4 11/22] hw/virtio: move vhd->started check into helper and add FIXME
On Tue, Aug 02, 2022 at 10:49:59AM +0100, Alex Bennée wrote: > The `started` field is manipulated internally within the vhost code > except for one place, vhost-user-blk via f5b22d06fb (vhost: recheck > dev state in the vhost_migration_log routine). Mark that as a FIXME > because it introduces a potential race. I think the referenced fix > should be tracking its state locally. I don't think we can track the state locally. As described in the commit message for f5b22d06fb, the state is used by vhost code in the vhost_migration_log() function so we probably need something at the vhost level. I do agree we shouldn't re-use vdev->started. Maybe we should add another 'active' variable in vhost_dev? I'm happy to send a patch for that. Until we agree on a better solution I'm happy with the FIXME. Reviewed-by: Raphael Norwitz > > Signed-off-by: Alex Bennée > --- > include/hw/virtio/vhost.h | 12 > hw/block/vhost-user-blk.c | 10 -- > hw/scsi/vhost-scsi.c | 4 ++-- > hw/scsi/vhost-user-scsi.c | 2 +- > hw/virtio/vhost-user-fs.c | 3 ++- > hw/virtio/vhost-user-i2c.c | 4 ++-- > hw/virtio/vhost-user-rng.c | 4 ++-- > hw/virtio/vhost-user-vsock.c | 2 +- > hw/virtio/vhost-vsock-common.c | 3 ++- > hw/virtio/vhost-vsock.c| 2 +- > 10 files changed, 33 insertions(+), 13 deletions(-) > > diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h > index 586c5457e2..61b957e927 100644 > --- a/include/hw/virtio/vhost.h > +++ b/include/hw/virtio/vhost.h > @@ -94,6 +94,7 @@ struct vhost_dev { > uint64_t protocol_features; > uint64_t max_queues; > uint64_t backend_cap; > +/* @started: is the vhost device started? */ > bool started; > bool log_enabled; > uint64_t log_size; > @@ -165,6 +166,17 @@ int vhost_dev_enable_notifiers(struct vhost_dev *hdev, > VirtIODevice *vdev); > */ > void vhost_dev_disable_notifiers(struct vhost_dev *hdev, VirtIODevice *vdev); > > +/** > + * vhost_dev_is_started() - report status of vhost device > + * @hdev: common vhost_dev structure > + * > + * Return the started status of the vhost device > + */ > +static inline bool vhost_dev_is_started(struct vhost_dev *hdev) > +{ > +return hdev->started; > +} > + > /** > * vhost_dev_start() - start the vhost device > * @hdev: common vhost_dev structure > diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c > index 9117222456..2bba42478d 100644 > --- a/hw/block/vhost-user-blk.c > +++ b/hw/block/vhost-user-blk.c > @@ -229,7 +229,7 @@ static void vhost_user_blk_set_status(VirtIODevice *vdev, > uint8_t status) > return; > } > > -if (s->dev.started == should_start) { > +if (vhost_dev_is_started(&s->dev) == should_start) { > return; > } > > @@ -286,7 +286,7 @@ static void vhost_user_blk_handle_output(VirtIODevice > *vdev, VirtQueue *vq) > return; > } > > -if (s->dev.started) { > +if (vhost_dev_is_started(&s->dev)) { > return; > } > > @@ -415,6 +415,12 @@ static void vhost_user_blk_event(void *opaque, > QEMUChrEvent event) > * the vhost migration code. If disconnect was caught there is an > * option for the general vhost code to get the dev state without > * knowing its type (in this case vhost-user). > + * > + * FIXME: this is sketchy to be reaching into vhost_dev > + * now because we are forcing something that implies we > + * have executed vhost_dev_stop() but that won't happen > + * until vhost_user_blk_stop() gets called from the bh. > + * Really this state check should be tracked locally. > */ > s->dev.started = false; > } > diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c > index 3059068175..bdf337a7a2 100644 > --- a/hw/scsi/vhost-scsi.c > +++ b/hw/scsi/vhost-scsi.c > @@ -120,7 +120,7 @@ static void vhost_scsi_set_status(VirtIODevice *vdev, > uint8_t val) > start = false; > } > > -if (vsc->dev.started == start) { > +if (vhost_dev_is_started(&vsc->dev) == start) { > return; > } > > @@ -147,7 +147,7 @@ static int vhost_scsi_pre_save(void *opaque) > > /* At this point, backend must be stopped, otherwise > * it might keep writing to memory. */ > -assert(!vsc->dev.started); > +assert(!vhost_dev_is_started(&vsc->dev)); > > return 0; > } > diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c > index 1b2f7eed98..bc37317d55 100644 > --- a/hw/scsi/vhost-user-scsi.c > +++ b/hw/scsi/vhost-user-scsi.c > @@ -49,7 +49,7 @@ static void vhost_user_scsi_set_status(VirtIODevice *vdev, > uint8_t status) > VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s); > bool start = (status & VIRTIO_CONFIG_S_DRIVER_OK) && vdev->vm_running; > > -if (vsc->dev.started == start) { > +if (vhost_dev_is_started
Re: [PATCH v6 4/8] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls
Sam Li 于2022年8月5日周五 15:58写道: > > By adding zone management operations in BlockDriver, storage controller > emulation can use the new block layer APIs including Report Zone and > four zone management operations (open, close, finish, reset). > > Add zoned storage commands of the device: zone_report(zrp), zone_open(zo), > zone_close(zc), zone_reset(zrs), zone_finish(zf). > > For example, to test zone_report, use following command: > $ ./build/qemu-io --image-opts driver=zoned_host_device, filename=/dev/nullb0 > -c "zrp offset nr_zones" > > Signed-off-by: Sam Li > Reviewed-by: Hannes Reinecke > --- > block/block-backend.c| 50 + > block/coroutines.h | 6 + > block/file-posix.c | 315 ++- > block/io.c | 41 > include/block/block-common.h | 1 - > include/block/block-io.h | 13 ++ > include/block/block_int-common.h | 22 ++- > include/block/raw-aio.h | 6 +- > meson.build | 1 + > qapi/block-core.json | 8 +- > qemu-io-cmds.c | 144 ++ > 11 files changed, 601 insertions(+), 6 deletions(-) > > diff --git a/block/block-backend.c b/block/block-backend.c > index d4a5df2ac2..fc639b0cd7 100644 > --- a/block/block-backend.c > +++ b/block/block-backend.c > @@ -1775,6 +1775,56 @@ int coroutine_fn blk_co_flush(BlockBackend *blk) > return ret; > } > > +/* > + * Send a zone_report command. > + * offset is a byte offset from the start of the device. No alignment > + * required for offset. > + * nr_zones represents IN maximum and OUT actual. > + */ > +int coroutine_fn blk_co_zone_report(BlockBackend *blk, int64_t offset, > +unsigned int *nr_zones, > +BlockZoneDescriptor *zones) > +{ > +int ret; > +IO_CODE(); > + > +blk_inc_in_flight(blk); /* increase before waiting */ > +blk_wait_while_drained(blk); > +if (!blk_is_available(blk)) { > +blk_dec_in_flight(blk); > +return -ENOMEDIUM; > +} > +ret = bdrv_co_zone_report(blk_bs(blk), offset, nr_zones, zones); > +blk_dec_in_flight(blk); > +return ret; > +} > + > +/* > + * Send a zone_management command. > + * offset is the starting zone specified as a sector offset. > + * len is the maximum number of sectors the command should operate on. > + */ > +int coroutine_fn blk_co_zone_mgmt(BlockBackend *blk, BlockZoneOp op, > +int64_t offset, int64_t len) > +{ > +int ret; > +IO_CODE(); > + > +ret = blk_check_byte_request(blk, offset, len); > +if (ret < 0) { > +return ret; > +} > +blk_inc_in_flight(blk); > +blk_wait_while_drained(blk); > +if (!blk_is_available(blk)) { > +blk_dec_in_flight(blk); > +return -ENOMEDIUM; > +} > +ret = bdrv_co_zone_mgmt(blk_bs(blk), op, offset, len); > +blk_dec_in_flight(blk); > +return ret; > +} > + > void blk_drain(BlockBackend *blk) > { > BlockDriverState *bs = blk_bs(blk); > diff --git a/block/coroutines.h b/block/coroutines.h > index 3a2bad564f..e3f62d94e5 100644 > --- a/block/coroutines.h > +++ b/block/coroutines.h > @@ -63,6 +63,12 @@ nbd_co_do_establish_connection(BlockDriverState *bs, bool > blocking, > Error **errp); > > > +int coroutine_fn blk_co_zone_report(BlockBackend *blk, int64_t offset, > +unsigned int *nr_zones, > +BlockZoneDescriptor *zones); > +int coroutine_fn blk_co_zone_mgmt(BlockBackend *blk, BlockZoneOp op, > + int64_t offset, int64_t len); > + > /* > * "I/O or GS" API functions. These functions can run without > * the BQL, but only in one specific iothread/main loop. > diff --git a/block/file-posix.c b/block/file-posix.c > index 4785203eea..2627431581 100644 > --- a/block/file-posix.c > +++ b/block/file-posix.c > @@ -67,6 +67,9 @@ > #include > #include > #include > +#if defined(CONFIG_BLKZONED) > +#include > +#endif > #include > #include > #include > @@ -216,6 +219,13 @@ typedef struct RawPosixAIOData { > PreallocMode prealloc; > Error **errp; > } truncate; > +struct { > +unsigned int *nr_zones; > +BlockZoneDescriptor *zones; > +} zone_report; > +struct { > +BlockZoneOp op; > +} zone_mgmt; > }; > } RawPosixAIOData; > > @@ -1369,7 +1379,7 @@ static void raw_refresh_limits(BlockDriverState *bs, > Error **errp) > #endif > > if (bs->sg || S_ISBLK(st.st_mode)) { > -int ret = hdev_get_max_hw_transfer(s->fd, &st); > +ret = hdev_get_max_hw_transfer(s->fd, &st); > > if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) { > bs->bl.max_hw_transfer = ret; > @@ -1386,6 +1396,27 @@ static void raw_refresh_limits(BlockDriverState *bs, > Error **err
Re: [PATCH v5 5/8] blockdev: Add a new IF type IF_OTHER
Daniel P. Berrangé writes: > On Thu, Aug 04, 2022 at 05:30:40PM +0200, Markus Armbruster wrote: >> Daniel P. Berrangé writes: >> >> > On Thu, Aug 04, 2022 at 04:56:15PM +0200, Markus Armbruster wrote: >> >> Daniel P. Berrangé writes: >> >> >> >> > On Thu, Jul 28, 2022 at 10:46:35AM +0100, Peter Maydell wrote: >> >> >> On Wed, 27 Jul 2022 at 20:03, Kevin Wolf wrote: >> >> >> > >> >> >> > Am 18.07.2022 um 11:49 hat Markus Armbruster geschrieben: >> >> >> > > An OTP device isn't really a parallel flash, and neither are >> >> >> > > eFuses. >> >> >> > > More fast-and-lose use of IF_PFLASH may exist in the tree, and >> >> >> > > maybe of >> >> >> > > other interface types, too. >> >> >> > > >> >> >> > > This patch introduces IF_OTHER. The patch after next uses it for >> >> >> > > an >> >> >> > > EEPROM device. >> >> >> > > >> >> >> > > Do we want IF_OTHER? >> >> >> > >> >> >> > What would the semantics even be? Any block device that doesn't pick >> >> >> > up >> >> >> > a different category may pick up IF_OTHER backends? >> >> >> > >> >> >> > It certainly feels like a strange interface to ask for "other" disk >> >> >> > and >> >> >> > then getting as surprise what this other thing might be. It's >> >> >> > essentially the same as having an explicit '-device other', and I >> >> >> > suppose most people would find that strange. >> >> >> > >> >> >> > > If no, I guess we get to abuse IF_PFLASH some more. >> >> >> > > >> >> >> > > If yes, I guess we should use IF_PFLASH only for actual parallel >> >> >> > > flash >> >> >> > > memory going forward. Cleaning up existing abuse of IF_PFLASH may >> >> >> > > not >> >> >> > > be worth the trouble, though. >> >> >> > > >> >> >> > > Thoughts? >> >> >> > >> >> >> > If the existing types aren't good enough (I don't have an opinion on >> >> >> > whether IF_PFLASH is a good match), let's add a new one. But a >> >> >> > specific >> >> >> > new one, not just "other". >> >> >> >> >> >> I think the common thread is "this isn't what anybody actually thinks >> >> >> of as being a 'disk', but we would like to back it with a block device >> >> >> anyway". That can cover a fair range of possibilities... >> >> > >> >> > Given that, do we even want/have to use -drive for this ?We can use >> >> > -blockdev for the backend and reference that from any -device we want >> >> > to create, and leave -drive out of the picture entirely >> >> >> >> -drive is our only means to configure onboard devices. >> >> >> >> We've talked about better means a few times, but no conclusions. I can >> >> dig up pointers, if you're interested. >> > >> > For onboard pflash with x86, we've just got properties against the >> > machine that we can point to a blockdev. >> >> True, but the vast majority of onboard block devices doesn't come with >> such properties. Please see >> >> Subject: On configuring onboard block devices with -blockdev (was: [PATCH v4 >> 6/7] hw/nvram: Update at24c EEPROM init function in NPCM7xx boards) >> Date: Mon, 15 Nov 2021 16:28:33 +0100 >> Message-ID: <875ystigke.fsf...@dusky.pond.sub.org> >> https://lists.gnu.org/archive/html/qemu-devel/2021-11/msg03173.html > > My take away from your mail there is that in the absence of better ideas > we should at least use machine properties for anything new we do so we > don't make the problem worse than it already is. It feels more useful > than inventing new IF_xxx possibilities for something we think is the > wrong approach already. The difficulty of providing machine properties for existing devices and the lack of adoption even for new devices make me doubt they are a viable path forward. In the message I referenced above, I wrote: If "replace them all by machine properties" is the way forward, we need to get going. At the current rate of one file a year (measured charitably), we'll be done around 2090, provided we don't add more (we've added quite a few since I did the first replacement). I figure this has slipped to the 22nd century by now. Yes, more uses of -drive are steps backward. But they are trivially easy, and also drops in the bucket. Machine properties are more difficult, and whether they actually take us forward seems doubtful.