On Tue, Apr 08, 2025 at 04:30:38PM +0530, prashant patil wrote:
>  Thank you, Eric, for the thorough information—truly appreciate it.
> 
> Just to confirm what I understood, when we are reading a bitmap with
> 'x-dirty-bitmap' (for powered on vm of course), the 'start' is always a
> logical offset no matter whether the record has 'offset' value or not. Is
> this correct?

Whether you are querying dirty bitmaps (x-dirty-bitmap on the command
line) or normal allocation (when that option is omitted), yes: 'start'
is always the logical offset of each extent, regardless of whether the
extent also carries an 'offset' value; the extents correspond to the
offsets that a read over the same connection would access.
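To make the offset semantics concrete, here is a small Python sketch
(my illustration, not qemu code) that walks a JSON map of the shape
'qemu-img map --output=json' produces, treating each 'start' as a
logical guest offset, using the single-extent map from your example
below:

```python
# Hedged sketch (not qemu code): interpret a block-status map where
# each extent's "start" is a logical guest offset, whether or not the
# extent also carries a host "offset" field.
import json

def summarize(map_json: str) -> dict:
    extents = json.loads(map_json)
    total_data = total_zero = 0
    end = 0
    for e in extents:
        # "start" is the logical offset; extents tile the disk in order.
        assert e["start"] == end, "extents are expected to be contiguous"
        end = e["start"] + e["length"]
        if e.get("data"):
            total_data += e["length"]
        if e.get("zero"):
            total_zero += e["length"]
    return {"virtual_size": end, "data": total_data, "zero": total_zero}

# The single-extent map from the example: the whole 1 GiB disk is
# reported as allocated data.
example = ('[{ "start": 0, "length": 1073741824, "depth": 0,'
           ' "present": true, "zero": false, "data": true,'
           ' "compressed": false, "offset": 0}]')
print(summarize(example))
# -> {'virtual_size': 1073741824, 'data': 1073741824, 'zero': 0}
```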

> 
> Also, I came across a case wherein we get the entire disk as allocated for
> a raw format disk which is present on lvm or lvm-thin storage (the disk has
> just a few MB data added, and the vm is in running state). Here is an
> example of 1Gb data. Is this expected behaviour?
> [{ "start": 0, "length": 1073741824, "depth": 0, "present": true, "zero":
> false, "data": true, "compressed": false, "offset": 0}]

For raw images, the ability to report holes depends on how well
lseek(SEEK_DATA) works, and that is filesystem-dependent.  (For
example, for the longest time, tmpfs had O(n) rather than O(1)
performance for a single call, making an lseek() map of the extents of
an entire file an untenable O(n^2) effort, so qemu purposefully avoids
lseek() when it is not known to be efficient.)  I would LOVE it if the
kernel supported lseek(SEEK_DATA) on block devices - in fact, here's a
patch series that Stefan started where we debated what that might look
like, although it never gained any traction at the time:
https://lore.kernel.org/lkml/20240328203910.2370087-1-stefa...@redhat.com/

It may also be possible for qemu to use ioctls to probe block device
extents when lseek() doesn't work directly, but patches would have to
be submitted, and that approach won't scale as well as having the
kernel report the information up front to all interested users: each
client would have to be taught the right ioctls to work around the
kernel's lack of a unified interface.

So in the short term, yes, it is reasonable to expect that qemu is not
able to report where the sparse regions of an lvm block device are.
Note that something reported as full data is always accurate, even if
inefficient.

One other side note - a few months back, I was working on a potential
project to write a CSI driver that used lvm devices, and was working
on what it would take for that CSI driver to expose the
'GetMetadataAllocated' and 'GetMetadataDelta' gRPC calls.  lvm code
did not, at the time, provide any convenient way to list which
portions of a thin volume were directly allocated or which were dirty
in relation to a prior snapshot.  There might be some hacks you can do
with device-mapper code to get at that, or newer versions of lvm code
might add something along those lines; but that was another place
where I would have loved to have a kernel interface for letting
lseek(SEEK_DATA) expose where the allocations live.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org

