On Tue, 09/02 16:09, Stefan Hajnoczi wrote:
> On Wed, Aug 27, 2014 at 10:49:08AM +0800, Fam Zheng wrote:
> > v3: Drop "RFC".
> >     Improvements according to Paolo's comments:
> >     05: Just use THREAD_DONE and ret = -ECANCELED in thread-pool.c
> >     06: Don't check dbs->cancelled for twice.
> >         Don't set dbs->acb to NULL.
> >
> > v2: Drop the unfinished scsi part, which was broken in v1. (Paolo)
> >     Add refcnt in BlockDriverAIOCB to maintain invariant of acb
> >     availability in bdrv_aio_cancel_async. (Paolo)
> >     Drop blkdebug change. (Stefan)
> >
> > This series adds a new block layer API:
> >
> >   void bdrv_aio_cancel_async(BlockDriverAIOCB *acb);
> >
> > which is similar to existing bdrv_aio_cancel in that it cancels an AIO
> > request, but different that it doesn't block until the request is
> > completely cancelled or done.
> >
> > More importantly, the completion callback, BlockDriverAIOCB.cb, is
> > guaranteed to be called, so that the cb can take care of resource
> > releasing and status reporting to guest, etc.
> >
> > In the following work, scsi emulation code will be shifted to use the
> > async cancelling.
> >
> > One major benefit would be that when guest tries to cancel a request,
> > where the request cannot be cancelled easily, (due to throttled
> > BlockDriverState, a lost connection, or a large request queue), we don't
> > need to block the whole vm with a busy loop, which is how bdrv_aio_cancel
> > is implemented now.
> >
> > A test case that is easy to reproduce is, throttle a scsi-disk to a very
> > low limit, for example 50 bps, then stress the guest block device with dd
> > or fio.
> >
> > Currently, the vm will quickly hang when it loses patience and send a tmf
> > command to cancel the request, at which point we will busy wait in
> > bdrv_aio_cancel, until the request is slowly spit out from throttled_reqs.
> >
> > Later, we will change scsi device code to make this asynchronous, on top
> > of bdrv_aio_cancel_async.
>
> We need to get rid of .bdrv_aio_cancel().  Keeping both
> .bdrv_aio_cancel() and .bdrv_aio_cancel_async() around is problematic
> because they have slightly different semantics.
>
> This patch series makes block driver cancellation more complex because
> we support both approaches :(.
>
> Could we do something like:
>
> void bdrv_aio_cancel(BdrvAIOCB *acb)
> {
>     bdrv_aiocb_ref(acb);
>     bdrv_aio_cancel_async(acb);
>     while (acb->ref > 1) {
>         aio_poll(bdrv_get_aio_context(acb->bs), true);
>     }
>     bdrv_aiocb_release(acb);
> }
>
> (pseudo-code)
>
> Then .bdrv_aio_cancel() should be deleted.
>

OK, I will drop the .io_cancel field from AIOCBInfo and convert all AIO
implementations to support only .io_cancel_async. That way we can emulate
bdrv_aio_cancel with bdrv_aio_cancel_async.

Thanks for reviewing!

Fam
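
For illustration only, the emulation discussed above (a blocking cancel
built from an asynchronous cancel plus a reference count) can be modelled
in a self-contained toy. This is a sketch, not QEMU code: every Toy*/toy_*
name is hypothetical, toy_poll() merely stands in for an event loop
iteration such as aio_poll(), and the rule that the request drops its own
reference after running the callback is an assumption made so that the
"refcnt > 1" loop from the pseudo-code terminates.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical toy types; none of these names exist in QEMU. */
typedef void ToyCompletionFunc(void *opaque, int ret);

typedef struct ToyAIOCB {
    int refcnt;              /* 1 while in flight, plus 1 per extra holder */
    bool cancel_requested;   /* set by toy_aio_cancel_async() */
    bool completed;
    int ticks_left;          /* poll iterations until natural completion */
    ToyCompletionFunc *cb;
    void *opaque;
} ToyAIOCB;

/* Small recorder so a caller can observe what the callback received. */
typedef struct ToyResult { int calls; int ret; } ToyResult;

static void toy_record_cb(void *opaque, int ret)
{
    ToyResult *r = opaque;
    r->calls++;
    r->ret = ret;
}

static ToyAIOCB *toy_aio_submit(int ticks, ToyCompletionFunc *cb, void *opaque)
{
    ToyAIOCB *acb = calloc(1, sizeof(*acb));
    if (!acb) {
        abort();
    }
    acb->refcnt = 1;         /* reference owned by the in-flight request */
    acb->ticks_left = ticks;
    acb->cb = cb;
    acb->opaque = opaque;
    return acb;
}

static void toy_aiocb_ref(ToyAIOCB *acb)
{
    acb->refcnt++;
}

static void toy_aiocb_unref(ToyAIOCB *acb)
{
    assert(acb->refcnt > 0);
    if (--acb->refcnt == 0) {
        free(acb);
    }
}

/* Always runs the callback, then drops the request's own reference. */
static void toy_complete(ToyAIOCB *acb, int ret)
{
    acb->completed = true;
    acb->cb(acb->opaque, ret);
    toy_aiocb_unref(acb);
}

/* Stand-in for one event loop iteration. The caller must hold a reference. */
static void toy_poll(ToyAIOCB *acb)
{
    if (acb->completed) {
        return;
    }
    if (acb->cancel_requested) {
        toy_complete(acb, -ECANCELED);
    } else if (--acb->ticks_left <= 0) {
        toy_complete(acb, 0);
    }
}

/* Non-blocking: only records the request; the callback fires on a later poll. */
static void toy_aio_cancel_async(ToyAIOCB *acb)
{
    acb->cancel_requested = true;
}

/* Blocking cancel emulated on top of the async one. */
static void toy_aio_cancel(ToyAIOCB *acb)
{
    toy_aiocb_ref(acb);            /* keep acb alive across completion */
    toy_aio_cancel_async(acb);
    while (acb->refcnt > 1) {      /* the request still holds its reference */
        toy_poll(acb);
    }
    toy_aiocb_unref(acb);          /* last reference: frees acb */
}
```

In this model the extra reference is what makes it safe to keep reading
acb->refcnt after the callback has run, and the callback still runs exactly
once on the cancelled path, which matches the guarantee described in the
cover letter.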