On Mon, Sep 26, 2011 at 5:30 PM, Stefan Hajnoczi <stefa...@linux.vnet.ibm.com> wrote:
> On Mon, Sep 26, 2011 at 05:11:00PM +0800, Zhi Yong Wu wrote:
>> On Mon, Sep 26, 2011 at 3:55 PM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
>> > On Mon, Sep 26, 2011 at 01:32:34PM +0800, Zhi Yong Wu wrote:
>> >> On Fri, Sep 23, 2011 at 11:57 PM, Stefan Hajnoczi
>> >> <stefa...@linux.vnet.ibm.com> wrote:
>> >> > Here is my generic image streaming branch, which aims to provide a way
>> >> > to copy the contents of a backing file into an image file of a running
>> >> > guest without requiring specific support in the various block drivers
>> >> > (e.g. qcow2, qed, vmdk):
>> >> >
>> >> > http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/image-streaming-api
>> >> >
>> >> > The tree does not provide full image streaming yet but I'd like to
>> >> > discuss the approach taken in the code. Here are the main points:
>> >> >
>> >> > The image streaming API is available through HMP and QMP commands. When
>> >> > streaming is started on a block device a coroutine is created to do the
>> >> > background I/O work. The coroutine can be cancelled.
>> >> >
>> >> > While the coroutine copies data from the backing file into the image
>> >> > file, the guest may be performing I/O to the image file. Guest reads do
>> >> > not conflict with streaming but guest writes require special handling.
>> >> > If the guest writes to a region of the image file that we are currently
>> >> > copying, then there is the potential to clobber the guest write with old
>> >> > data from the backing file.
>> >> >
>> >> > Previously I solved this in a QED-specific way by taking advantage of
>> >> > the serialization of allocating write requests. In order to do this
>> >> > generically we need to track in-flight requests and have the ability to
>> >> > queue I/O. Guest writes that affect an in-flight streaming copy
>> >> > operation must wait for that operation to complete before being issued.
>> >> > Streaming copy operations must skip overlapping regions of guest writes.
>> >> >
>> >> > One big difference to the QED image streaming implementation is that
>> >> > this generic implementation is not based on copy-on-read operations.
>> >> > Instead we do a sequence of bdrv_is_allocated() to find regions for
>> >> > streaming, followed by bdrv_co_read() and bdrv_co_write() in order to
>> >> > populate the image file.
>> >> >
>> >> > It turns out that generic copy-on-read is not an attractive operation
>> >> > because it requires using bounce buffers for every request. Kevin
>> >> bounce buffers == buffer ring?
>> >
>> > A bounce buffer is a temporary buffer that is used because the actual
>> > data buffer is not addressable or cannot be directly accessed for some
>> > other reason. In this case it's because the guest should see read
>> > semantics and not find that writes to its read data buffer result in
>> > writes to disk.
>> >
>> >> > pointed out the case where a guest performs a read and pokes the data
>> >> > buffer before the read completes, copy-on-read would write out the
>> >> > modified memory into the image file unless we use a bounce buffer.
>> Sorry, to be honest, I don't know which scenario would cause guest-modified
>> memory to be written out into the image file.
>
> I showed the scenario in the steps posted below:
>
>> >> Can you elaborate on this?
>> >
>> > 1. Guest issues a read request.
>> > 2. QEMU issues host read request as first step in copy-on-read.
>> > 3. Host read request completes...
>> > 4. Guest overwrites its data buffer before QEMU acknowledges request
>> >    completion.
>> > 5. ...QEMU issues host write request.
>> > 6. Host completes write request and QEMU acknowledges guest read
>> >    completion.
>> Good, thanks.
>> >
>> > What happened is that we populated the image file with data from guest
>> > memory that does not match what is in the backing file. The guest
>> How do we find out that the two sets of data don't match?
>
> Reread what I posted and think about the case where a QEMU read buffer
> (the "bounce buffer") is used in step 2. In that case the guest cannot
> tamper with the data buffer while performing copy-on-read.
Got it now, thanks.
>
> Stefan
>
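
For anyone following the thread, the six steps quoted above can be modelled with a small sketch. This is a toy simulation in Python, not QEMU code: copy_on_read() and its parameters are hypothetical names made up for illustration, and the backing file, image file and guest data buffer are each reduced to a single bytearray. It assumes only what the steps say: the host read lands either directly in the guest's buffer or in a private bounce buffer, and the host write is issued from that same buffer.

```python
def copy_on_read(backing, image, guest_buf, use_bounce, guest_poke=None):
    """Toy model of one copy-on-read request (hypothetical, not a QEMU API).

    backing, image and guest_buf are equally sized bytearrays. guest_poke,
    if given, is what the guest scribbles into its own buffer while the
    request is still in flight (step 4 in the thread).
    """
    # Steps 1-3: the guest read is turned into a host read from the backing
    # file. Without a bounce buffer the data lands in guest memory directly.
    target = bytearray(len(backing)) if use_bounce else guest_buf
    target[:] = backing

    # Step 4: the guest overwrites its data buffer before QEMU has
    # acknowledged completion of the read.
    if guest_poke is not None:
        guest_buf[:] = guest_poke

    # Step 5: the host write populates the image file from whatever buffer
    # the host read went into.
    image[:] = target

    # Step 6: completion. With a bounce buffer, the guest only now receives
    # the data, copied out of the private buffer.
    if use_bounce:
        guest_buf[:] = target
    return image


if __name__ == "__main__":
    backing = bytearray(b"BACK")

    # No bounce buffer: the guest's poke ends up in the image file.
    image = bytearray(4)
    copy_on_read(backing, image, bytearray(4), use_bounce=False, guest_poke=b"POKE")
    print("without bounce buffer:", bytes(image))

    # Bounce buffer: the image file matches the backing file.
    image = bytearray(4)
    copy_on_read(backing, image, bytearray(4), use_bounce=True, guest_poke=b"POKE")
    print("with bounce buffer:   ", bytes(image))
```

Without the bounce buffer the image ends up holding the guest's poked bytes instead of the backing file's data. With it the image matches the backing file, at the cost of an extra buffer and an extra copy on every request, which is the overhead that makes generic copy-on-read unattractive.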
--
Regards,

Zhi Yong Wu