Here is my generic image streaming branch, which aims to provide a way to copy the contents of a backing file into an image file of a running guest without requiring specific support in the various block drivers (e.g. qcow2, qed, vmdk):
http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/image-streaming-api

The tree does not provide full image streaming yet, but I'd like to discuss the approach taken in the code. Here are the main points:

The image streaming API is available through HMP and QMP commands. When streaming is started on a block device, a coroutine is created to do the background I/O work. The coroutine can be cancelled.

While the coroutine copies data from the backing file into the image file, the guest may be performing I/O to the image file. Guest reads do not conflict with streaming, but guest writes require special handling. If the guest writes to a region of the image file that we are currently copying, the copy could clobber the guest write with old data from the backing file.

Previously I solved this in a QED-specific way by taking advantage of the serialization of allocating write requests. To do this generically we need to track in-flight requests and have the ability to queue I/O. Guest writes that affect an in-flight streaming copy operation must wait for that operation to complete before being issued. Streaming copy operations must skip regions that overlap guest writes.

One big difference from the QED image streaming implementation is that this generic implementation is not based on copy-on-read operations. Instead we do a sequence of bdrv_is_allocated() calls to find regions for streaming, followed by bdrv_co_read() and bdrv_co_write() to populate the image file.

It turns out that generic copy-on-read is not an attractive operation because it requires using bounce buffers for every request. Kevin pointed out the case where a guest performs a read and pokes the data buffer before the read completes: without a bounce buffer, copy-on-read would write the modified memory out into the image file.

There are a few pieces missing in my tree, which have mostly been solved in other places and just need to be reused:

1. Arbitration between guest and streaming requests (this is the only real new thing).

2. Efficient zero handling (skip writing those regions or mark them as zero clusters).

3. Queuing/dependencies when arbitration decides a request must wait. I'm taking a look at reusing Zhi Yong's block queue.

4. Rate-limiting to ensure streaming I/O does not impact the guest. This already exists in the QED-specific patches; it may make sense to extract common code that both migration and the block layer can use.

Ideas or questions?

Stefan
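
To make the "streaming must skip overlapping regions of guest writes" rule concrete, here is a toy sketch in C. It is not code from the tree: ToyImage, InFlightWrite, stream_pass() and guest_write_complete() are hypothetical names, the "image" is an in-memory array with one byte per sector, and everything runs synchronously. It only illustrates the allocation check and the overlap skip; the other half of the arbitration (guest writes waiting for an in-flight streaming copy) needs real queuing and is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NB_SECTORS 16

/* Hypothetical in-memory stand-in for an image file and its backing file. */
typedef struct {
    uint8_t backing[NB_SECTORS];   /* one byte per "sector" */
    uint8_t image[NB_SECTORS];
    bool allocated[NB_SECTORS];    /* true once the image file owns the sector */
} ToyImage;

/* A guest write that has been submitted but has not completed yet. */
typedef struct {
    int64_t sector;
    int nb_sectors;
    bool active;
} InFlightWrite;

/* Two half-open sector ranges [a, a+na) and [b, b+nb) overlap. */
static bool ranges_overlap(int64_t a, int na, int64_t b, int nb)
{
    return a < b + nb && b < a + na;
}

static bool overlaps_guest_write(const InFlightWrite *w, int n,
                                 int64_t sector, int nb_sectors)
{
    for (int i = 0; i < n; i++) {
        if (w[i].active &&
            ranges_overlap(w[i].sector, w[i].nb_sectors, sector, nb_sectors)) {
            return true;
        }
    }
    return false;
}

/*
 * One streaming pass: copy every unallocated sector from the backing file
 * into the image file, skipping sectors covered by an in-flight guest write.
 * Returns the number of sectors copied.
 */
static int stream_pass(ToyImage *img, const InFlightWrite *w, int n)
{
    int copied = 0;

    for (int64_t s = 0; s < NB_SECTORS; s++) {
        if (img->allocated[s]) {
            continue;               /* plays the role of bdrv_is_allocated() */
        }
        if (overlaps_guest_write(w, n, s, 1)) {
            continue;               /* guest data wins, do not clobber it */
        }
        img->image[s] = img->backing[s];   /* read from backing + write to image */
        img->allocated[s] = true;
        copied++;
    }
    return copied;
}

/* Complete a guest write: the data lands in the image and becomes allocated. */
static void guest_write_complete(ToyImage *img, InFlightWrite *w,
                                 const uint8_t *data)
{
    assert(w->sector >= 0 && w->sector + w->nb_sectors <= NB_SECTORS);
    memcpy(&img->image[w->sector], data, (size_t)w->nb_sectors);
    for (int i = 0; i < w->nb_sectors; i++) {
        img->allocated[w->sector + i] = true;
    }
    w->active = false;
}
```

With a guest write in flight on sectors 4-5, a first stream_pass() copies the other 14 sectors and leaves 4-5 alone. Once the write completes, those sectors are allocated, so a later pass has nothing left to copy and the guest data is never overwritten with backing file contents.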