Ping?

On 2019/4/15 10:39, Xiang Zheng wrote:
> On 2019/4/12 18:57, Kevin Wolf wrote:
>> On 12.04.2019 at 11:50, Xiang Zheng wrote:
>>>
>>> On 2019/4/12 9:52, Xiang Zheng wrote:
>>>> On 2019/4/11 20:22, Kevin Wolf wrote:
>>>>> Okay, so your problem is that blk_pread() writes to the whole buffer,
>>>>> writing explicit zeroes for unallocated parts of the image, while you
>>>>> would like to leave those parts of the buffer untouched so that we don't
>>>>> actually allocate the memory, but can just use the shared zero page.
>>>>>
>>>>> If you just want to read the non-zero parts of the image, that can be
>>>>> done by using a loop that calls bdrv_block_status() and only reads from
>>>>> the image if the BDRV_BLOCK_ZERO bit is clear.
>>>>>
>>>>> Would this solve your problem?
>>>>
>>>> Sounds good! What if guest tried to read/write the zero parts?
>>>>
>>>
>>> I wrote the below patch (refer to bdrv_make_zero()) for test, it seems
>>> that everything is OK and the memory is also exactly allocated on demand.
>>>
>>> This requires pflash devices to use sparse files backend. Thus I have to
>>> create images like:
>>>
>>> dd of="QEMU_EFI-pflash.raw" if="/dev/zero" bs=1M seek=64 count=0
>>> dd of="QEMU_EFI-pflash.raw" if="QEMU_EFI.fd" conv=notrunc
>>>
>>> dd of="empty_VARS.fd" if="/dev/zero" bs=1M seek=64 count=0
>>>
>>>
>>> ---8>---
>>>
>>> diff --git a/block/block-backend.c b/block/block-backend.c
>>> index f78e82a..ed8ca87 100644
>>> --- a/block/block-backend.c
>>> +++ b/block/block-backend.c
>>> @@ -1379,6 +1379,12 @@ BlockAIOCB *blk_aio_pwrite_zeroes(BlockBackend *blk, int64_t offset,
>>>                          flags | BDRV_REQ_ZERO_WRITE, cb, opaque);
>>>  }
>>>
>>> +int blk_pread_nonzeroes(BlockBackend *blk, void *buf)
>>> +{
>>> +    int ret = bdrv_pread_nonzeroes(blk->root, buf);
>>> +    return ret;
>>> +}
>>
>> I don't think this deserves a place in the public block layer interface,
>> as it's only a single device that makes use of it.
>>
>> Maybe you wrote things this way because there is no blk_block_status(),
>> but you can get the BlockDriverState with blk_bs(blk) and then implement
>> everything inside hw/block/block.c.
>
> Yes, you are right.
>
>>
>>>  int blk_pread(BlockBackend *blk, int64_t offset, void *buf, int count)
>>>  {
>>>      int ret = blk_prw(blk, offset, buf, count, blk_read_entry, 0);
>>> diff --git a/block/io.c b/block/io.c
>>> index dfc153b..83e5ea7 100644
>>> --- a/block/io.c
>>> +++ b/block/io.c
>>> @@ -882,6 +882,38 @@ int bdrv_pwrite_zeroes(BdrvChild *child, int64_t offset,
>>>                             BDRV_REQ_ZERO_WRITE | flags);
>>>  }
>>>
>>> +int bdrv_pread_nonzeroes(BdrvChild *child, void *buf)
>>> +{
>>> +    int ret;
>>> +    int64_t target_size, bytes, offset = 0;
>>> +    BlockDriverState *bs = child->bs;
>>> +
>>> +    target_size = bdrv_getlength(bs);
>>> +    if (target_size < 0) {
>>> +        return target_size;
>>> +    }
>>> +
>>> +    for (;;) {
>>> +        bytes = MIN(target_size - offset, BDRV_REQUEST_MAX_BYTES);
>>> +        if (bytes <= 0) {
>>> +            return 0;
>>> +        }
>>> +        ret = bdrv_block_status(bs, offset, bytes, &bytes, NULL, NULL);
>>> +        if (ret < 0) {
>>> +            return ret;
>>> +        }
>>> +        if (ret & BDRV_BLOCK_ZERO) {
>>> +            offset += bytes;
>>> +            continue;
>>> +        }
>>> +        ret = bdrv_pread(child, offset, buf, bytes);
>>> +        if (ret < 0) {
>>> +            return ret;
>>> +        }
>>> +        offset += bytes;
>>
>> I think the code becomes simpler the other way round:
>>
>>     if (!(ret & BDRV_BLOCK_ZERO)) {
>>         ret = bdrv_pread(child, offset, buf, bytes);
>>         if (ret < 0) {
>>             return ret;
>>         }
>>     }
>>     offset += bytes;
>>
>> You don't increment buf, so if you have a hole in the file, this will
>> corrupt the buffer. You need to either increment buf, too, or use
>> (uint8_t*) buf + offset for the bdrv_pread() call.
>>
>
> Yes, I didn't notice it. I think the latter is better. Does *BDRV_BLOCK_ZERO*
> mean that there are all-zeroes data or a hole in the sector? But if I use an
> image filled with zeroes, it will not set BDRV_BLOCK_ZERO bit on return.
>
> Should I resend a patch?
>
> ---8>---
>
> From 4dbfe4955aa9fe23404cbe1890fbe148be2ff10e Mon Sep 17 00:00:00 2001
> From: Xiang Zheng <zhengxia...@huawei.com>
> Date: Sat, 13 Apr 2019 02:27:03 +0800
> Subject: [PATCH] pflash: Only read non-zero parts of backend image
>
> Currently we fill the VIRT_FLASH memory space with two 64MB NOR images
> when using persistent UEFI variables on the virt board. Actually we only
> use a very small (non-zero) part of the memory, while the rest, a
> significantly larger (zero) part of the memory, is wasted.
>
> So this patch checks the block status and only writes the non-zero part
> into memory. This requires pflash devices to use sparse files for
> backends.
>
> Signed-off-by: Xiang Zheng <zhengxia...@huawei.com>
> ---
>  hw/block/block.c | 40 +++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 39 insertions(+), 1 deletion(-)
>
> diff --git a/hw/block/block.c b/hw/block/block.c
> index bf56c76..3cb9d4c 100644
> --- a/hw/block/block.c
> +++ b/hw/block/block.c
> @@ -15,6 +15,44 @@
>  #include "qapi/qapi-types-block.h"
>
>  /*
> + * Read the non-zero parts of @blk into @buf.
> + * Reading all of @blk is expensive if the zero parts of @blk
> + * are large. Therefore check the block status and only write
> + * the non-zero blocks into @buf.
> + *
> + * Return 0 on success, negative errno on error.
> + */
> +static int blk_pread_nonzeroes(BlockBackend *blk, void *buf)
> +{
> +    int ret;
> +    int64_t target_size, bytes, offset = 0;
> +    BlockDriverState *bs = blk_bs(blk);
> +
> +    target_size = bdrv_getlength(bs);
> +    if (target_size < 0) {
> +        return target_size;
> +    }
> +
> +    for (;;) {
> +        bytes = MIN(target_size - offset, BDRV_REQUEST_MAX_SECTORS);
> +        if (bytes <= 0) {
> +            return 0;
> +        }
> +        ret = bdrv_block_status(bs, offset, bytes, &bytes, NULL, NULL);
> +        if (ret < 0) {
> +            return ret;
> +        }
> +        if (!(ret & BDRV_BLOCK_ZERO)) {
> +            ret = bdrv_pread(bs->file, offset, (uint8_t *) buf + offset, bytes);
> +            if (ret < 0) {
> +                return ret;
> +            }
> +        }
> +        offset += bytes;
> +    }
> +}
> +
> +/*
>   * Read the entire contents of @blk into @buf.
>   * @blk's contents must be @size bytes, and @size must be at most
>   * BDRV_REQUEST_MAX_BYTES.
> @@ -53,7 +91,7 @@ bool blk_check_size_and_read_all(BlockBackend *blk, void *buf, hwaddr size,
>       * block device and read only on demand.
>       */
>      assert(size <= BDRV_REQUEST_MAX_BYTES);
> -    ret = blk_pread(blk, 0, buf, size);
> +    ret = blk_pread_nonzeroes(blk, buf);
>      if (ret < 0) {
>          error_setg_errno(errp, -ret, "can't read block backend");
>          return false;
> --
Thanks, Xiang
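
For anyone who wants to try the idea from this thread outside QEMU, here is a
minimal sketch. It skips the zero ranges and reads every other range to its own
offset in the buffer, which is the "(uint8_t *) buf + offset" point Kevin
raised. It is only an analogy, not QEMU code: it uses plain POSIX lseek() with
SEEK_DATA/SEEK_HOLE (available on Linux and some other systems) where the patch
uses bdrv_block_status(). The names read_data_extents() and
sparse_read_selftest() are hypothetical and exist only for this example.

```c
#define _GNU_SOURCE             /* exposes SEEK_DATA / SEEK_HOLE on glibc */
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA               /* Linux values, used only if the headers hide them */
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/*
 * Hypothetical helper (not a QEMU function): copy only the data extents of
 * the file behind @fd into @buf. @buf must hold @size bytes and must already
 * be zero-filled; holes are skipped, so those parts of @buf are never written.
 * Returns 0 on success, -errno on failure.
 */
int read_data_extents(int fd, void *buf, off_t size)
{
    off_t offset = 0;

    assert(buf != NULL && size >= 0);
    while (offset < size) {
        off_t data = lseek(fd, offset, SEEK_DATA);
        if (data < 0) {
            /* ENXIO means nothing but a hole remains before end of file */
            return errno == ENXIO ? 0 : -errno;
        }
        if (data >= size) {
            return 0;
        }
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole < 0) {
            return -errno;
        }
        if (hole > size) {
            hole = size;
        }
        /* Read [data, hole) to the SAME offsets in the buffer */
        while (data < hole) {
            ssize_t n = pread(fd, (uint8_t *)buf + data,
                              (size_t)(hole - data), data);
            if (n < 0) {
                if (errno == EINTR) {
                    continue;
                }
                return -errno;
            }
            if (n == 0) {
                return -EIO;    /* file shrank underneath us */
            }
            data += n;
        }
        offset = hole;
    }
    return 0;
}

/* Usage sketch: 1 MiB sparse file with a short payload in the middle. */
int sparse_read_selftest(void)
{
    enum { SIZE = 1 << 20, DATA_OFF = 512 * 1024 };
    static const char payload[] = "pflash-test-data";
    char path[] = "/tmp/sparse-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) {
        return -1;
    }
    unlink(path);               /* file disappears once fd is closed */

    uint8_t *buf = calloc(1, SIZE);
    uint8_t *expect = calloc(1, SIZE);
    int ok = buf && expect
        && ftruncate(fd, SIZE) == 0
        && pwrite(fd, payload, sizeof(payload), DATA_OFF)
               == (ssize_t)sizeof(payload)
        && read_data_extents(fd, buf, SIZE) == 0;
    if (ok) {
        memcpy(expect + DATA_OFF, payload, sizeof(payload));
        ok = memcmp(buf, expect, SIZE) == 0;
    }
    free(buf);
    free(expect);
    close(fd);
    return ok ? 0 : -1;
}
```

The result is the same on a filesystem that does not track holes; the whole
file is then reported as one data extent and simply read in full. Dropping the
"+ data" from the pread() destination reproduces the buffer corruption
described above, because data after a hole would land at the start of the
buffer.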