Nir Soffer <nir...@gmail.com> writes: > When the `read-zeroes` is set, reads produce zeroes, and block status > return BDRV_BLOCK_ZERO, emulating a sparse image. > > If we don't set `read-zeros` we report BDRV_BLOCK_DATA, but image data > is undefined; posix_memalign, _aligned_malloc, valloc, or memalign do > not promise to zero allocated memory. > > When computing a blkhash of an image via qemu-nbd, we want to test 3 > cases: > > 1. Sparse image: skip reading the entire image based on block status > result, and use a pre-computed zero block hash. > 2. Image full of zeroes: read the entire image, detect block full of > zeroes and skip block hash computation. > 3. Image full of data: read the entire image and compute a hash of all > blocks. > > This change adds `read-pattern` option. If the option is set, reads > produce the specified pattern. With this option we can emulate an image > full of zeroes or full of non-zeroes. > > The following examples shows how the new option can be used with blksum > (or nbdcopy --blkhash) to compute a blkhash of an image using the > null-co driver. > > Sparse image - the very fast path: > > % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/sparse.sock \ > "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', > 'read-zeroes': true}}" & > > % time blksum 'nbd+unix:///?socket=/tmp/sparse.sock' > 300ad1efddb063822fea65ae3174cd35320939d4d0b050613628c6e1e876f8f6 > nbd+unix:///?socket=/tmp/sparse.sock > blksum 'nbd+unix:///?socket=/tmp/sparse.sock' 0.05s user 0.01s system > 92% cpu 0.061 total > > Image full of zeros - same hash, 268 times slower: > > % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/zero.sock \ > "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', > 'read-pattern': 0}}" & > > % time blksum 'nbd+unix:///?socket=/tmp/zero.sock' > 300ad1efddb063822fea65ae3174cd35320939d4d0b050613628c6e1e876f8f6 > nbd+unix:///?socket=/tmp/zero.sock > blksum 'nbd+unix:///?socket=/tmp/zero.sock' 7.45s user 22.57s system > 183% cpu 16.347 total > > Image full of data - difference hash, heavy cpu usage: > > % ./qemu-nbd -r -t -e 0 -f raw -k /tmp/data.sock \ > "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'size': '100g', > 'read-pattern': 255}}" & > > % time blksum 'nbd+unix:///?socket=/tmp/data.sock' > 2c122b3ed28c83ede3c08485659fa9b56ee54ba1751db74d8ba9aa13d9866432 > nbd+unix:///?socket=/tmp/data.sock > blksum 'nbd+unix:///?socket=/tmp/data.sock' 46.05s user 14.15s system > 448% cpu 13.414 total > > Specifying both `read-zeroes` and `read-pattern` is an error since > `read-zeroes` implies a sparse image. Example errors: > > % ./qemu-img map --output json \ > "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'read-pattern': > -1}}" > qemu-img: Could not open 'json:{...}': read_pattern is out of range > (0-255) > > % ./qemu-img map --output json \ > "json:{'driver': 'raw', 'file': {'driver': 'null-co', 'read-pattern': > 0, 'read-zeroes': true}}" > qemu-img: Could not open 'json:{...}': The parameters read-zeroes and > read-pattern are in conflict > > Tested on top of > https://lists.gnu.org/archive/html/qemu-devel/2025-04/msg05096.html. > > Signed-off-by: Nir Soffer <nir...@gmail.com>
[...] > diff --git a/docs/devel/secure-coding-practices.rst > b/docs/devel/secure-coding-practices.rst > index 0454cc527e..73830684ea 100644 > --- a/docs/devel/secure-coding-practices.rst > +++ b/docs/devel/secure-coding-practices.rst > @@ -111,5 +111,6 @@ Use of null-co block drivers > The ``null-co`` block driver is designed for performance: its read accesses > are > not initialized by default. In case this driver has to be used for security > research, it must be used with the ``read-zeroes=on`` option which fills read > -buffers with zeroes. Security issues reported with the default > +buffers with zeroes, or with the ``read-pattern=N`` option which fills read > +buffers with pattern. Security issues reported with the default > (``read-zeroes=off``) will be discarded. > diff --git a/qapi/block-core.json b/qapi/block-core.json > index 7c95c9e36a..2205ac9758 100644 > --- a/qapi/block-core.json > +++ b/qapi/block-core.json > @@ -3295,13 +3295,23 @@ > # > # @read-zeroes: if true, emulate a sparse image, and reads from the > # device produce zeroes; if false, emulate an allocated image but > -# reads from the device leave the buffer unchanged. > +# reads from the device leave the buffer unchanged. Mutually > +# exclusive with @read-pattern. > # (default: false; since: 4.1) Correct before the patch: absent behaves just like present and false. That's no longer the case afterwards. Hmm. > # > +# @read-pattern: if set, emulate an allocated image, and reads from the > +# device produce the specified byte value; if unset, reads from the > +# device leave the buffer unchanged. Mutually exclusive with > +# @read-zeroes. > +# (since: 10.1) Default? The usual way to document it would be something like (default: false), but that's again problematic. > +# > # Since: 2.9 > ## > { 'struct': 'BlockdevOptionsNull', > - 'data': { '*size': 'int', '*latency-ns': 'uint64', '*read-zeroes': 'bool' > } } > + 'data': { '*size': 'int', > + '*latency-ns': 'uint64', > + '*read-zeroes': 'bool', > + '*read-pattern': 'uint8' } } > > ## > # @BlockdevOptionsNVMe: Let's take a step back from the concrete interface and ponder what we're trying to do. We want three cases: * Allocated, reads return unspecified crap (security hazard) * Allocated, reads return specified data * Sparse, reads return zeroes How would we do this if we started on a green field? Here's my try: bool sparse uint8 contents so that * Allocated, reads return unspecified crap (security hazard) @sparse is false, and @contents is absent * Allocated, reads return specified data @sparse is false, and @contents is present * Sparse, reads return zeroes @sparse is true, and @contents must absent or zero I'd make @sparse either mandatory or default to true, so that we don't default to security hazard. Now compare this to your patch: same configuration data (bool × uint8), better names, cleaner semantics, better defaults. Unless we want to break compatibility, we're stuck with the name @read-zeroes for the bool, and its trap-for-the-unwary default value, but cleaner semantics seem possible. Thoughts?