On 07/12/2011 10:45 AM, Stefan Hajnoczi wrote:
> On Tue, Jul 12, 2011 at 9:06 AM, Kevin Wolf <kw...@redhat.com> wrote:
>> On 11.07.2011 18:32, Marcelo Tosatti wrote:
>>> On Mon, Jul 11, 2011 at 03:47:15PM +0100, Stefan Hajnoczi wrote:
>>>> Kevin, Marcelo,
>>>> I'd like to reach agreement on the QMP/HMP APIs for live block copy
>>>> and image streaming. Libvirt has acked the image streaming APIs that
>>>> Adam proposed and I think they are a good fit for the feature. I have
>>>> described that API below for your review (it's exactly what the QED
>>>> Image Streaming patches provide).
>>>>
>>>> Marcelo: Are you happy with this API for live block copy? Also please
>>>> take a look at the switch command that I am proposing.
>>>>
>>>> Image streaming API
>>>> ===================
>>>>
>>>> For leaf images with copy-on-read semantics, the stream commands allow
>>>> the user to populate local blocks by manually streaming them from the
>>>> backing image. Once all blocks have been streamed, the dependency on
>>>> the original backing image can be removed. Therefore, stream commands
>>>> can be used to implement post-copy live block migration and rapid
>>>> deployment.
>>>>
>>>> The block_stream command can be used to stream a single cluster, to
>>>> start streaming the entire device, and to cancel an active stream. It
>>>> is easiest to allow the block_stream command to manage streaming for the
>>>> entire device but a management tool could use single cluster mode to
>>>> throttle the I/O rate.
>>
>> As discussed earlier, having the management send requests for each
>> single cluster doesn't make any sense at all. It wouldn't only throttle
>> the I/O rate but bring it down to a level that makes it unusable. What
>> you really want is to allow the management to give us a range (offset +
>> length) that qemu should stream.
>
> I feel that an iteration interface is problematic whether the
> management tool or QEMU decides what to stream. Let's have just the
> background streaming operation.
>
> The problem with byte ranges is two-fold. The management tool doesn't
> know which regions of the image are allocated so it may do a lot of
> nop calls to already-allocated regions with no intelligence as to
> where the next sensible offset for streaming is. Secondly, because
> the progress and performance of image streaming depend largely on
> whether or not clusters are allocated (it is very fast when a cluster
> is already allocated and we have no work to do), offsets are bad
> indicators of progress to the user. I think it's best not to expose
> these details to the management tool at all.
>
> The only reason for the iteration interface was to punt I/O throttling
> to the management tool. I think it would be easier to just throttle
> inside the streaming function.
>
> Kevin: Are you happy with dropping the iteration interface?
> Adam: Is there a libvirt requirement for iteration or could we support
> background copy only?
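
To make the "throttle inside the streaming function" idea concrete, here is
a minimal Python sketch. It is an illustration only and not QEMU code: the
names stream_image, copy_cluster, and max_clusters_per_sec are hypothetical,
the image is modelled as a plain list of per-cluster allocation flags, and
the rate limit is expressed in clusters per second purely for simplicity.

```python
import time
from typing import Callable, List, Tuple


def stream_image(
    allocated: List[bool],
    copy_cluster: Callable[[int], None],
    max_clusters_per_sec: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[Tuple[int, int]]:
    """Copy every unallocated cluster, throttled to a maximum rate.

    `allocated` is a hypothetical stand-in for the image's allocation map
    and is updated in place. Returns one (offset, length) progress sample
    per visited cluster, both measured in clusters.
    """
    if max_clusters_per_sec <= 0:
        raise ValueError("max_clusters_per_sec must be positive")

    interval = 1.0 / max_clusters_per_sec
    progress: List[Tuple[int, int]] = []
    next_allowed = clock()

    for index, is_allocated in enumerate(allocated):
        if not is_allocated:
            # Only real copies consume I/O budget, so only they are delayed.
            delay = next_allowed - clock()
            if delay > 0:
                sleep(delay)
            copy_cluster(index)
            allocated[index] = True
            next_allowed = max(next_allowed, clock()) + interval
        # Already-allocated clusters are skipped with no delay at all.
        progress.append((index + 1, len(allocated)))

    return progress
```

In this sketch the caller makes a single call for the whole device and the
loop paces itself, so no per-cluster round trip to the management tool is
needed. The progress samples also show the point made above about offsets:
they advance instantly across allocated clusters and slowly across
unallocated ones, so an offset alone says little about remaining time.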

There is no hard requirement for iteration in libvirt. However, I think
there is a requirement that we report some sort of progress to an end
user. These operations can easily take many minutes (even hours) and
such a long-running operation needs to report progress. I think the
current information returned by 'query-block-stream' is appropriate for
this purpose and should definitely be maintained.

--
Adam Litke
IBM Linux Technology Center