11.01.2019 10:54, Vladimir Sementsov-Ogievskiy wrote: > 10.01.2019 23:51, Eric Blake wrote: >> On 1/10/19 7:20 AM, Vladimir Sementsov-Ogievskiy wrote: >>> drv_co_block_status digs bs->file for additional, more accurate search >>> for hole inside region, reported as DATA by bs since 5daa74a6ebc. >> >> s/region, reported/regions reported/ >> >>> >>> This accuracy is not free: assume we have qcow2 disk. Actually, qcow2 >>> knows, where are holes and where is data. But every block_status >> >> Not quite true. qcow2 knows where some holes are, but does not know if >> there are even more holes hiding in the sections marked data (such as >> because of how the disk was pre-allocated). >> >>> request calls lseek additionally. Assume a big disk, full of >>> data, in any iterative copying block job (or img convert) we'll call >>> lseek(HOLE) on every iteration, and each of these lseeks will have to >>> iterate through all metadata up to the end of file. It's obviously >>> ineffective behavior. And for many scenarios we don't need this lseek >>> at all. >> >> How much performance can we buy back without any knobs at all, if we >> just taught posix-file.c to cache lseek() results? That is, when >> visiting a file sequentially, if lseek(fd, 0, SEEK_HOLE) returns EOF on >> our first block status query, then all subsequent block status queries >> fall within what we know to be data, and we can skip the lseek() calls. > > EOF is bad mark I think. We may have a small hole not far from EOF, which > will lead to the same performance, but not EOF returned. > > I think, proper implementation of lseek cache should work, but it is more > difficult than just disable lseek. And in scenarios without preallocation > we don't need lseek, so again better is disable it. > > And I don't sure that qemu is a proper place for lseek optimizations. > > And another way may be to select cases when fiemap is safe and use it. > > Again, for scenarios when we don't need nor lseek, nor fiemap, neither > cache for them the best option is not use them. > >> >>> >>> So, let's "5daa74a6ebc" by default, leaving an option to return >> >> s/let's/let's undo/ >> >>> previous behavior, which is needed for scenarios with preallocated >>> images. >>> >>> Add iotest illustrating new option semantics. >> >> And none of the existing iotests fail due to changed output? Does that >> mean our testsuite does not have good coverage of pre-allocation modes >> where the zero probe mattered? > > To be honest, I didn't run all the tests, only several. > > Do it now, and found that, yes, at least 102 broken. Will fix it with the > following > version. Strange, do patchew run tests on patches in list now? > >> >>> >>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com> >>> --- >>> >> >>> +++ b/qapi/block-core.json >>> @@ -3673,6 +3673,8 @@ >>> # (default: off) >>> # @force-share: force share all permission on added nodes. >>> # Requires read-only=true. (Since 2.10) >>> +# @probe-zeroes: Probe zeroes on protocol layer if format reports data >>> +# (default: false) (since 4.0) >> >> Why do we need a new bool? Can we instead... >> >>> # >>> # Remaining options are determined by the block driver. >>> # >>> @@ -3686,7 +3688,8 @@ >>> '*read-only': 'bool', >>> '*auto-read-only': 'bool', >>> '*force-share': 'bool', >>> - '*detect-zeroes': 'BlockdevDetectZeroesOptions' }, >>> + '*detect-zeroes': 'BlockdevDetectZeroesOptions', >>> + '*probe-zeroes': 'bool' }, >> >> ...add another enum value to 'detect-zeroes', since after all, what we >> are really doing is fine-tuning what level of zero probing we are >> willing to do? > > Should we? Or I just bind old behavior to detect-zeroes=on? And then, if > needed, > we'll add intermediate modes. > >> >> >>> +++ b/qemu-options.hx >>> @@ -615,6 +615,7 @@ DEF("blockdev", HAS_ARG, QEMU_OPTION_blockdev, >>> "-blockdev [driver=]driver[,node-name=N][,discard=ignore|unmap]\n" >>> " [,cache.direct=on|off][,cache.no-flush=on|off]\n" >>> " [,read-only=on|off][,detect-zeroes=on|off|unmap]\n" >>> + " [,probe-zeroes=on|off]\n" >>> " [,driver specific parameters...]\n" >>> " configure a block backend\n", QEMU_ARCH_ALL) >>> STEXI >>> @@ -670,6 +671,8 @@ discard requests. >>> conversion of plain zero writes by the OS to driver specific optimized >>> zero write commands. You may even choose "unmap" if @var{discard} is set >>> to "unmap" to allow a zero write to be converted to an @code{unmap} >>> operation. >>> +@item probe-zeroes=@var{probe-zeroes} >>> +Probe zeroes on protocol layer if format reports data. Default is "off". >> >> Again, fitting this into detect-zeroes seems better than inventing a new >> knob. >> > > No objections. But description of detect-zeroes are more about writes, should > we change them somehow? > > # Describes the operation mode for the automatic conversion of plain > # zero writes by the OS to driver specific optimized zero write commands. > > ... > > # @detect-zeroes: detect and optimize zero writes (Since 2.1) > > >
Hm, just note, that detect-zeroes was about writes and should be set on target, and the new thing is about block-status and should be set on source, and this thing should be described in spec as well. -- Best regards, Vladimir