On 8/12/19 5:03 PM, Max Reitz wrote:
> On 12.08.19 22:26, John Snow wrote:
>>
>>
>> On 7/25/19 11:57 AM, Max Reitz wrote:
>>> Compressed writes generally have to write full clusters, not just in
>>> theory but also in practice when it comes to vmdk's streamOptimized
>>> subformat. It currently is just silently broken for writes with
>>> non-zero in-cluster offsets:
>>>
>>> $ qemu-img create -f vmdk -o subformat=streamOptimized foo.vmdk 1M
>>> $ qemu-io -c 'write 4k 4k' -c 'read 4k 4k' foo.vmdk
>>> wrote 4096/4096 bytes at offset 4096
>>> 4 KiB, 1 ops; 00.01 sec (443.724 KiB/sec and 110.9309 ops/sec)
>>> read failed: Invalid argument
>>>
>>> (The technical reason is that vmdk_write_extent() just writes the
>>> incomplete compressed data actually to offset 4k. When reading the
>>> data, vmdk_read_extent() looks at offset 0 and finds the compressed data
>>> size to be 0, because that is what it reads from there. This yields an
>>> error.)
>>>
>>> For incomplete writes with zero in-cluster offsets, the error path when
>>> reading the rest of the cluster is a bit different, but the result is
>>> the same:
>>>
>>> $ qemu-img create -f vmdk -o subformat=streamOptimized foo.vmdk 1M
>>> $ qemu-io -c 'write 0k 4k' -c 'read 4k 4k' foo.vmdk
>>> wrote 4096/4096 bytes at offset 0
>>> 4 KiB, 1 ops; 00.01 sec (362.641 KiB/sec and 90.6603 ops/sec)
>>> read failed: Invalid argument
>>>
>>> (Here, vmdk_read_extent() finds the data and then sees that the
>>> uncompressed data is short.)
>>>
>>> It is better to reject invalid writes than to make the user believe they
>>> might have succeeded and then fail when trying to read it back.
>>>
>>> Signed-off-by: Max Reitz <mre...@redhat.com>
>>> ---
>>> block/vmdk.c | 10 ++++++++++
>>> 1 file changed, 10 insertions(+)
>>>
>>> diff --git a/block/vmdk.c b/block/vmdk.c
>>> index db6acfc31e..641acacfe0 100644
>>> --- a/block/vmdk.c
>>> +++ b/block/vmdk.c
>>> @@ -1731,6 +1731,16 @@ static int vmdk_write_extent(VmdkExtent *extent,
>>> int64_t cluster_offset,
>>> if (extent->compressed) {
>>> void *compressed_data;
>>>
>>> + /* Only whole clusters */
>>> + if (offset_in_cluster ||
>>> + n_bytes > (extent->cluster_sectors * SECTOR_SIZE) ||
>>> + (n_bytes < (extent->cluster_sectors * SECTOR_SIZE) &&
>>> + offset + n_bytes != extent->end_sector * SECTOR_SIZE))
>>> + {
>>> + ret = -EINVAL;
>>> + goto out;
>>> + }
>>> +
>>> if (!extent->has_marker) {
>>> ret = -EINVAL;
>>> goto out;
>>>
>>
>> What does this look like from a guest's perspective? Is there something
>> that enforces the alignment in the graph for us?
>>
>> Or is it the case that indeed guests (or users via qemu-io) can request
>> invalid writes and we will halt the VM in those cases (in preference to
>> corrupting the disk)?
>
> Have you ever tried using a streamOptimized VMDK disk with a guest?
>
Nope! It's why I'm asking. I have no idea what the whole picture before
and after is.
> I haven’t, but I know that it won’t work. O:-) If you try to write to
> an already allocated cluster, you’ll get an EIO and an error message via
> error_report() (“Could not write to allocated cluster for
> streamOptimized”). So really, the only use of streamOptimized is as a
> qemu-img convert source/target, or as a backup/mirror target. (Just
> like compressed clusters in qcow2 images.)
>
OK, makes sense. Someone's going to try to use it in cases where it
doesn't make sense though, for sure.
> I suppose if I introduced streamOptimized support today, I wouldn’t just
> forward vmdk_co_pwritev_compressed() to vmdk_co_pwritev(), but instead
> make vmdk_co_pwritev_compressed() only work on streamOptimized images,
> and vmdk_co_pwritev() only on everything else. Then it would be more clear.
>
> Hm. In fact, that’s a bug, isn’t it? vmdk will accept compressed
> writes for any subformat, even if it doesn’t support compression. So if
> you use -c and convert to vmdk, it will succeed, but the result won’t be
> compressed,
>
> It’s also a bit weird to accept normal writes for streamOptimized, but
> I’m not sure whether that’s really a bug? In any case, changing this
> behavior would not be backwards-compatible... Should we deprecate
> normal writes to streamOptimized?
>
If it's supposed to be the case that streamOptimized *only* gets
compressed, aligned writes -- it probably is a bug to do normal,
uncompressed writes, isn't it? Does that produce a usable image?
> Max
>
Anyway, I'm fine with this patch because things aren't any worse, and
our support for non-native formats has always been kind of best-attempt.
Reviewed-by: John Snow <js...@redhat.com>