On Mon, Jun 6, 2011 at 10:50 AM, Kevin Wolf <kw...@redhat.com> wrote:
> On 02.06.2011 00:11, Stefan Hajnoczi wrote:
>> On Wed, Jun 1, 2011 at 10:13 AM, Alexander Graf <ag...@suse.de> wrote:
>>>
>>> On 01.06.2011, at 11:11, Kevin Wolf wrote:
>>>
>>>> On 01.06.2011 10:49, Alexander Graf wrote:
>>>>>
>>>>> On 01.06.2011, at 06:29, Stefan Hajnoczi wrote:
>>>>>
>>>>>> On Sun, May 29, 2011 at 2:19 PM, Fam Zheng <famc...@gmail.com> wrote:
>>>>>>> As a project of Google Summer of Code 2011, I'm now working on
>>>>>>> improving VMDK image support. There are many subformats of VMDK
>>>>>>> virtual disk, some of which have separate descriptor file and others
>>>>>>> don't, some allocate space at once and some others grow dynamically,
>>>>>>> some have optional data compression. The current support of VMDK
>>>>>>> format is very limited, i.e. qemu now supports single file images, but
>>>>>>> couldn't recognize the widely used multi-file types. We have planned
>>>>>>> to add such support to VMDK block driver and enable more image types,
>>>>>>> and the working timeline is set in weeks (#1 to #7) as:
>>>>>>>
>>>>>>> [#1] Monolithic flat layout support
>>>>>>> [#2] Implement compression and Stream-Optimized Compressed Sparse
>>>>>>> Extents support.
>>>>>>> [#3] Improve ESX Server Sparse Extents support.
>>>>>>> [#4] Debug and test. Collect virtual disks with various versions and
>>>>>>> options, test qemu-img with them. By now some patches may be ready to
>>>>>>> deliver.
>>>>>>> [#5, 6] Add multi-file support (2GB extent formats)
>>>>>>> [#7] Clean up and midterm evaluation.
>>>>>>
>>>>>> Thanks to Fam's work, we'll hopefully support the latest real-world
>>>>>> VMDK files in qemu-img convert within the next few months.
>>>>>>
>>>>>> If anyone has had particular VMDK "problem files" which qemu-img
>>>>>> cannot handle, please reply, they would make interesting test cases.
>>>>>
>>>>> There is one very useful use-case of VMDK files that we currently don't
>>>>> support: remapping.
>>>>>
>>>>> A vmdk file can specify that it really is backed by a raw block device,
>>>>> but only for certain chunks, while other chunks of it can be mapped
>>>>> read-only or zero. That is very useful when passing in a host disk to the
>>>>> guest and you want to be sure that you don't break other partitions than
>>>>> the one you're playing with.
>>>>>
>>>>> It can also shadow map those chunks. For example on the case above, the
>>>>> MBR is COW (IIRC) for the image, so you can install a bootloader in there.
>>>>
>>>> Hm, wondering if that's something to consider for qcow2v3, too... Do you
>>>> think it's still useful when doing this on a cluster granularity? It
>>>> would only work for well-aligned partitions then, but I think that
>>>> shouldn't be a problem for current OSes.
>>>
>>> Well, we could always just hack around for bits where it overlaps. When
>>> passing in a differently aligned partition for example, we could just
>>> declare the odd sector as COW sector and copy the contents over :). Though
>>> that might not be what the user really wants. Hrm.
>>>
>>>> Basically, additionally to the three cluster types "read from this
>>>> image", "COW from backing file" and "zero cluster" we could introduce a
>>>> fourth one "read/write to backing file".
>>>
>>> Yup, sounds very much straight forward! Then all we need is some tool to
>>> create such a qcow file :)
>>
>> If we want to implement mini-device mapper why not do it as a separate
>> BlockDriver? This could be useful for non-qcow2 cases like *safely*
>> passing through a physical disk with a guarantee that you won't
>> destroy the MBR. Also if we do it outside of an image format we don't
>> need to worry about clusters and can do sector-granularity mapping.
>>
>> In fact, if we want mini-device mapper, that could be used to
>> implement the VMDK multi-file support too.
>> So if Fam writes a generic
>> BlockDriver extent mapper we can use it from VMDK but also from
>> command-line options that tie together qcow2, qed, raw, etc images.
>
> Does it really work for Alex' case, where you have some parts of an
> image file that you want to be COW and other parts that write directly
> to the backing file?
>
> Or to put it in a more general way: Does it work when you reference an
> image more than once? Wouldn't you have to open the same image twice?
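As an aside, the remapping Alex describes lives in the VMDK descriptor
file's extent table. A rough sketch of what such a descriptor fragment
might look like (extent sizes and filenames are invented here, and the
exact keywords should be double-checked against the VMDK spec):

```
# Extent description: <access> <size in sectors> <type> [filename] [offset]
RW 63     FLAT "mbr-cow.img" 0    # COW'd MBR area, writable copy
RW 2048   ZERO                    # reads return zeros, writes are dropped
RW 409600 FLAT "/dev/sda2" 0      # mapped straight through to the host partition
```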
Here is an example of booting from a physical disk:

[mbr][/dev/zero][/dev/sda]

mbr is a COW image based on /dev/sda. /dev/zero is used to protect the
region where the first partition would be: the guest only sees zeroes
and writes are ignored, because the guest should never access this
region. /dev/sda is the extent containing the second partition
(actually we could just open /dev/sda2).

Here we have the case that you mentioned, with /dev/sda open as the
read-only backing file for mbr and as read-write for the second
partition. The question is whether raw images are safe for multiple
opens when at least one of them is read-write. I think the answer for
raw is yes. It is not safe to open non-raw image files multiple times.

I'm also wondering if the -blockdev backing_file=<backing> option that
has been discussed could be used in non-raw cases. Instead of opening
backing files by name, specify the backing file block device on the
command-line so that the same BlockDriverState is shared, avoiding
inconsistencies.

The multiple opener issue is orthogonal to device mapper support.

Stefan
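P.S. To make the sector-granularity mapping idea concrete, here is a toy
sketch in plain Python (not QEMU code; all names are invented for
illustration) of how a generic extent mapper BlockDriver could dispatch
guest I/O to its extents:

```python
class Extent:
    """One contiguous guest sector range mapped to a backend."""
    def __init__(self, start, length, backend, mode):
        self.start = start      # first guest sector of this extent
        self.length = length    # number of sectors covered
        self.backend = backend  # dict sector -> data, stands in for a file/device
        self.mode = mode        # "rw", "ro", or "zero"

class ExtentMapper:
    """Routes per-sector reads/writes to the extent covering that sector."""
    def __init__(self, extents):
        # extents are assumed sorted and non-overlapping
        self.extents = extents

    def _find(self, sector):
        for e in self.extents:
            if e.start <= sector < e.start + e.length:
                return e
        raise ValueError("sector %d is not mapped" % sector)

    def read(self, sector):
        e = self._find(sector)
        if e.mode == "zero":
            return b"\0" * 512          # zero extent: always reads as zeros
        return e.backend.get(sector - e.start, b"\0" * 512)

    def write(self, sector, data):
        e = self._find(sector)
        if e.mode == "zero":
            return                      # writes to a zero extent are dropped
        if e.mode == "ro":
            raise IOError("write to read-only extent")
        e.backend[sector - e.start] = data
```

For the disk layout above this would be instantiated as, roughly,
ExtentMapper([Extent(0, 63, mbr_cow, "rw"), Extent(63, N, None, "zero"),
Extent(63 + N, M, sda2, "rw")]) -- the point being that the dispatch is by
sector range, with no cluster granularity involved.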