> On 29 Apr 2025, at 0:55, Eric Blake <ebl...@redhat.com> wrote:
>
> On Mon, Apr 28, 2025 at 01:59:00AM +0300, Nir Soffer wrote:
>> When the `read-zeroes` is set, reads produce zeroes, and block status
>> return BDRV_BLOCK_ZERO, emulating a sparse image.
>>
>> If we don't set `read-zeros` we report BDRV_BLOCK_DATA, but image data
>> is undefined; posix_memalign, _aligned_malloc, valloc, or memalign do
>> not promise to zero allocated memory.
>>
>> When computing a blkhash of an image via qemu-nbd, we want to test 3
>> cases:
>>
>> 1. Sparse image: skip reading the entire image based on block status
>> result, and use a pre-computed zero block hash.
>> 2. Image full of zeroes: read the entire image, detect block full of
>> zeroes and skip block hash computation.
>> 3. Image full of data: read the entire image and compute a hash of all
>> blocks.
>>
>> This change adds `read-pattern` option. If the option is set, reads
>> produce the specified pattern. With this option we can emulate an image
>> full of zeroes or full of non-zeroes.
>>
>> Specifying both `read-zeroes` and `read-pattern != 0` is not useful
>> since `read-zeroes` implies a sparse image. In this case `read-zeroes`
>> wins and we ignore the pattern. Maybe we need to make the options mutual
>> exclusive.
>
> I would lean towards an error. It's easier to remove an error later
> if we find it was too strict, than to be lax up front and then regret
> it down the road when we wish we had been more strict.
Good point
>
>>
>> The following examples shows how the new option can be used with blksum
>> (or nbdcopy --blkhash) to compute a blkhash of an image using the
>> null-co driver.
>>
>> Sparse image - the very fast path:
>>
>> % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/sparse.sock \
>> "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g',
>> 'read-zeroes': true}}" &
>>
>> % time blksum 'nbd+unix:///?socket=/tmp/sparse.sock'
>> 300ad1efddb063822fea65ae3174cd35320939d4d0b050613628c6e1e876f8f6
>> nbd+unix:///?socket=/tmp/sparse.sock
>> blksum 'nbd+unix:///?socket=/tmp/sparse.sock' 0.05s user 0.01s system
>> 92% cpu 0.061 total
>>
>> Image full of zeros - same hash, 268 times slower:
>>
>> % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/zero.sock \
>> "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g',
>> 'read-pattern': 0}}" &
>>
>> % time blksum 'nbd+unix:///?socket=/tmp/zero.sock'
>> 300ad1efddb063822fea65ae3174cd35320939d4d0b050613628c6e1e876f8f6
>> nbd+unix:///?socket=/tmp/zero.sock
>> blksum 'nbd+unix:///?socket=/tmp/zero.sock' 7.45s user 22.57s system
>> 183% cpu 16.347 total
>>
>> Image full of data - difference hash, heavy cpu usage:
>>
>> % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/data.sock \
>> "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g',
>> 'read-pattern': -1}}" &
>>
>> % time blksum 'nbd+unix:///?socket=/tmp/data.sock'
>> 2c122b3ed28c83ede3c08485659fa9b56ee54ba1751db74d8ba9aa13d9866432
>> nbd+unix:///?socket=/tmp/data.sock
>> blksum 'nbd+unix:///?socket=/tmp/data.sock' 46.05s user 14.15s system
>> 448% cpu 13.414 total
>>
>> Tested on top of
>> https://lists.gnu.org/archive/html/qemu-devel/2025-04/msg05096.html.
>>
>> Signed-off-by: Nir Soffer <nir...@gmail.com>
>> ---
>> block/null.c | 17 +++++++++++++++++
>> qapi/block-core.json | 9 ++++++++-
>> 2 files changed, 25 insertions(+), 1 deletion(-)
>
> Should block status return ZERO|DATA when read-pattern=0 is present?
This will make it useless for testing image full of zeros which is the reason I
added it. qemu-img convert, nbdcopy, and blksum look only at the zero bit and
skip reading the extent.