On Mon, Apr 28, 2025 at 01:59:00AM +0300, Nir Soffer wrote:
> When the `read-zeroes` is set, reads produce zeroes, and block status
> return BDRV_BLOCK_ZERO, emulating a sparse image.
> 
> If we don't set `read-zeros` we report BDRV_BLOCK_DATA, but image data
> is undefined; posix_memalign, _aligned_malloc, valloc, or memalign do
> not promise to zero allocated memory.
> 
> When computing a blkhash of an image via qemu-nbd, we want to test 3
> cases:
> 
> 1. Sparse image: skip reading the entire image based on block status
>    result, and use a pre-computed zero block hash.
> 2. Image full of zeroes: read the entire image, detect block full of
>    zeroes and skip block hash computation.
> 3. Image full of data: read the entire image and compute a hash of all
>    blocks.
> 
> This change adds `read-pattern` option. If the option is set, reads
> produce the specified pattern. With this option we can emulate an image
> full of zeroes or full of non-zeroes.
> 
> Specifying both `read-zeroes` and `read-pattern != 0` is not useful
> since `read-zeroes` implies a sparse image.  In this case `read-zeroes`
> wins and we ignore the pattern. Maybe we need to make the options mutual
> exclusive.

I would lean towards an error.  It's easier to remove an error later
if we find it was too strict, than to be lax up front and then regret
it down the road when we wish we had been more strict.

> 
> The following examples shows how the new option can be used with blksum
> (or nbdcopy --blkhash) to compute a blkhash of an image using the
> null-co driver.
> 
> Sparse image - the very fast path:
> 
>     % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/sparse.sock \
>         "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 
> 'read-zeroes': true}}" &
> 
>     % time blksum 'nbd+unix:///?socket=/tmp/sparse.sock'
>     300ad1efddb063822fea65ae3174cd35320939d4d0b050613628c6e1e876f8f6  
> nbd+unix:///?socket=/tmp/sparse.sock
>     blksum 'nbd+unix:///?socket=/tmp/sparse.sock'  0.05s user 0.01s system 
> 92% cpu 0.061 total
> 
> Image full of zeros - same hash, 268 times slower:
> 
>     % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/zero.sock \
>         "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 
> 'read-pattern': 0}}" &
> 
>     % time blksum 'nbd+unix:///?socket=/tmp/zero.sock'
>     300ad1efddb063822fea65ae3174cd35320939d4d0b050613628c6e1e876f8f6  
> nbd+unix:///?socket=/tmp/zero.sock
>     blksum 'nbd+unix:///?socket=/tmp/zero.sock'  7.45s user 22.57s system 
> 183% cpu 16.347 total
> 
> Image full of data - difference hash, heavy cpu usage:
> 
>     % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/data.sock \
>         "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', 
> 'read-pattern': -1}}" &
> 
>     % time blksum 'nbd+unix:///?socket=/tmp/data.sock'
>     2c122b3ed28c83ede3c08485659fa9b56ee54ba1751db74d8ba9aa13d9866432  
> nbd+unix:///?socket=/tmp/data.sock
>     blksum 'nbd+unix:///?socket=/tmp/data.sock'  46.05s user 14.15s system 
> 448% cpu 13.414 total
> 
> Tested on top of
> https://lists.gnu.org/archive/html/qemu-devel/2025-04/msg05096.html.
> 
> Signed-off-by: Nir Soffer <nir...@gmail.com>
> ---
>  block/null.c         | 17 +++++++++++++++++
>  qapi/block-core.json |  9 ++++++++-
>  2 files changed, 25 insertions(+), 1 deletion(-)

Should block status return ZERO|DATA when read-pattern=0 is present?

> +# @read-pattern: if set, reads from the device produce the specified
> +#     pattern; if unset, the buffer is left unchanged.
> +#     (since: 10.1)
> +#
>  # Since: 2.9
>  ##
>  { 'struct': 'BlockdevOptionsNull',
> -  'data': { '*size': 'int', '*latency-ns': 'uint64', '*read-zeroes': 'bool' 
> } }
> +  'data': { '*size': 'int',
> +            '*latency-ns': 'uint64',
> +            '*read-zeroes': 'bool',
> +            '*read-pattern': 'int' } }

Should this be 'uint8' instead of 'int', so that we aren't silently
truncating spurious upper bits when passing it to memset()?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org


Reply via email to