On Sun, Jan 23, 2011 at 11:40 PM, Anthony Liguori <anth...@codemonkey.ws> wrote: > On 01/22/2011 03:29 AM, Stefan Hajnoczi wrote: >> >> Converting qcow2 to use coroutines is fairly simple since most of qcow2 >> is synchronous. The synchronous I/O functions likes bdrv_pread() now >> transparently work when called from a coroutine, so all the synchronous >> code just works. >> >> The explicitly asynchronous code is adjusted to repeatedly call >> qcow2_aio_read_cb() or qcow2_aio_write_cb() until the request completes. >> At that point the coroutine will return from its entry function and its >> resources are freed. >> >> The bdrv_aio_readv() and bdrv_aio_writev() user callback is now invoked >> from a BH. This is necessary since the user callback code does not >> expect to be executed from a coroutine. >> >> This conversion is not completely correct because the safety the >> synchronous code does not carry over to the coroutine version. >> Previously, a synchronous code path could assume that it will never be >> interleaved with another request executing. This is no longer true >> because bdrv_pread() and bdrv_pwrite() cause the coroutine to yield and >> other requests can be processed during that time. >> >> The solution is to carefully introduce checks so that pending requests >> do not step on each other's toes. That is left for a future patch... >> >> Signed-off-by: Stefan Hajnoczi<stefa...@linux.vnet.ibm.com> >> > > As an alternative approach, could we trap async calls from the block device, > implement them in a synchronous fashion, then issue the callback > immediately? > > This would mean that qcow_aio_write() would become fully synchronous which > also means that you can track when the operation is completed entirely > within the block layer. IOW, it should be possible to do this with almost > no change to qcow2.
I'm not sure I understand what you're suggesting. Right now bdrv_read() for coroutines is implemented on top of bdrv_aio_readv(). And bdrv_pread() is implemented on top of bdrv_read(). It doesn't make sense to me to implement bdrv_aio_readv() in terms of bdrv_read(). Also is it safe to invoke the callback without a BH? > I think this is the right approach too. If we're using coroutines, we > shouldn't do anything asynchronous in the image formats. The good bit about > this is that we can probably dramatically simplify the block layer API but > eliminating the sync/async versions of everything. Hardware emulation needs the asynchronous API so I don't think we can get rid of bdrv_aio_readv(), bdrv_aio_writev(), and bdrv_aio_flush() completely. IDE and SCSI also like to be able to cancel their aio requests. Non-invasive coroutines support in the block layer will allow us to easily make the more obscure image formats asynchronous too :). Stefan