On 25.07.23 18:01, ThinerLogoer wrote:

At 2023-07-25 19:42:30, "David Hildenbrand" <[email protected]> wrote:
Hi,

patch subject should start with "softmmu/physmem: Open ..."

Sorry I am newbie to the patch submission part. I will resubmit a version of 
patch if the
final acceptable patch after discussion is mostly the same. (For example, if 
this patch
finally involves adding another parameter and adding various hooks, then I may 
feel it
hard to finish the patch myself, both due to lack of knowledge of qemu source 
code tree,
and due to lack of various environment to test every case out)

No worries. I'll be happy to guide you. But if you feel more comfortable that I take over, just let me know.


Anyway thanks to all your suggestions.


On 25.07.23 12:52, Thiner Logoer wrote:
An read only file can be mapped with read write as long as the
mapping is private, which is very common case. Make

At least in the environments I know, using private file mappings is a corner 
case ;)

What is you use case? VM templating?

Mostly, if I understand the terminology correct. I was experimenting on vm 
snapshoting
that uses MAP_PRIVATE when recovering memory, similar to what firecracker says 
in this
documentation.

https://github.com/firecracker-microvm/firecracker/blob/main/docs/snapshotting/snapshot-support.md

And in my experiment qemu supports recovering from a memory file + a guest 
state file out
of the box.
In fact, `-mem-path filename4pc.ram` works out of the box (since the default 
parameter is
map_private+readwrite), only that vanilla setup requires memory file to be 
writeable

Oh, you're saying "-mem-path" results in a MAP_PRIVATE mapping? That sounds very nasty :/ It probably was introduced only for hugetlb handling, and wouldn't actually share memory with another process.

In fact, we added MAP_SHARED handling later

commit dbcb8981183592be129b2e624b7bcd4245e75fbc
Author: Paolo Bonzini <[email protected]>
Date:   Tue Jun 10 19:15:24 2014 +0800

    hostmem: add property to map memory with MAP_SHARED

    A new "share" property can be used with the "memory-file" backend to
    map memory with MAP_SHARED instead of MAP_PRIVATE.


Even one doc in docs/devel/multi-process.rst is wrong:

"Note guest memory must be backed by file descriptors, such as when QEMU is given the *-mem-path* command line option."

... no, that won't work with a MAP_PRIVATE mapping.


though the file never gets written. (the actual memory file & guest state file 
require
separated hacking)

And at least the patch provided here have been the solution to this last 
problem for me
for a while.

By the way the commit: "Commit 134253a4, machine: do not crash if default RAM 
backend name
has been stolen" disallows me to use a memory backed file directly as pc.ram 
and make
`-object memory-backed-file,*` based setup more complex (I cannot easily make 
the memory

Can't you simply do

-object memory-backed-file,id=mem1 \
-machine q35,memory-backend=mem1,share=off \

Or what would be the problem with that?

unbacked by any file before snapshoting and backed by file after recovery from 
snapshot
after this patch). This is the reason why I prefer `-mem-path` despite the doc 
tells that
this usage is close to deprecated, and that `-mem-path` has less configurable 
parameters.


qemu_ram_alloc_from_file open file as read only when the
mapping is private, otherwise open will fail when file
does not allow write.

If this file does not exist or is a directory, the flag is not used,
so it should be OK.

from https://gitlab.com/qemu-project/qemu/-/issues/1689

Signed-off-by: Thiner Logoer <[email protected]>
---
   softmmu/physmem.c | 9 ++++++++-
   1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 3df73542e1..e8036ee335 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1945,8 +1945,15 @@ RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, 
MemoryRegion*mr,
       int fd;
       bool created;
       RAMBlock *block;
+

^

.git/rebase-apply/patch:13: trailing whitespace.

I remembered I have deleted this whitespace before. Obviously I have messed up 
with
different version of patch files, sorry about that ...


No worries :)


+    /*
+     * If map is private, the fd does not need to be writable.
+     * This only get effective when the file is existent.

"This will get ignored if the file does not yet exist."

+     */
+    bool open_as_readonly = readonly || !(ram_flags & RAM_SHARED);

-    fd = file_ram_open(mem_path, memory_region_name(mr), readonly, &created,
+    fd = file_ram_open(mem_path, memory_region_name(mr),
+                       open_as_readonly, &created,
                          errp);
       if (fd < 0) {
           return NULL;


Opening a file R/O will also make operations like fallocate/ftruncate/ ... fail.

I saw fallocate in softmmu/physmem.c on somewhere, though I was not sure how it 
is
actually used. Your response fills in this part.


For example, this will make fallocate(FALLOC_FL_PUNCH_HOLE) stop working and in
turn make ram_block_discard_range() bail out.


There was a recent discussion/patch on that:

commit 1d44ff586f8a8e113379430750b5a0a2a3f64cf9
Author: David Hildenbrand <[email protected]>
Date:   Thu Jul 6 09:56:06 2023 +0200

     softmmu/physmem: Warn with ram_block_discard_range() on MAP_PRIVATE file 
mapping

     ram_block_discard_range() cannot possibly do the right thing in
     MAP_PRIVATE file mappings in the general case.

     To achieve the documented semantics, we also have to punch a hole into
     the file, possibly messing with other MAP_PRIVATE/MAP_SHARED mappings
     of such a file.

     For example, using VM templating -- see commit b17fbbe55cba ("migration:
     allow private destination ram with x-ignore-shared") -- in combination with
     any mechanism that relies on discarding of RAM is problematic. This
     includes:
     * Postcopy live migration
     * virtio-balloon inflation/deflation or free-page-reporting
     * virtio-mem

     So at least warn that there is something possibly dangerous is going on
     when using ram_block_discard_range() in these cases.


I did not expect that multiple qemu features will contradict each other - 
private cow map
of file & user fault fd based on demand memory serving ... (do not blame me too 
much if I
get the terminology wrong - I am no professional qemu dev :D)

Let me rephrase:

"I did not wish that multiple qemu features will contradict each other"

:)



While it doesn't work "in the general case", it works in the "single file 
owner" case
where someone simply forgot to specify "share=on" -- "share=off" is the default 
for
memory-backend-file :( .


For example, with hugetlb+virtio-mem the following works if the file does not 
exists:

(note that virtio-mem will fallocate(FALLOC_FL_PUNCH_HOLE) the whole file 
upfront)

...
-object 
memory-backend-file,share=off,mem-path=/dev/hugepages/vmem0,id=mem2,size=2G \
-device virtio-mem-pci,id=vmem0,memdev=mem2,requested-size=1g,bus=root


With you patch, once the file already exists, we would now get

qemu-system-x86_64: -device
virtio-mem-pci,id=vmem0,memdev=mem2,requested-size=1g,bus=root: 
ram_block_discard_range:
Failed to fallocate :0 +80000000 (-9)
qemu-system-x86_64: -device
virtio-mem-pci,id=vmem0,memdev=mem2,requested-size=1g,bus=root: Unexpected error
discarding RAM: Bad file descriptor


So this has the potential to break existing setups.

The easy fix for these would be to configure "share=on" in these now-failing 
setups. Hmmmmm ....

I am afraid that the easiest prefix could be to configure `share=on` when the 
path starts
with "/dev/huge" while firing a warning :D

I am sorry about that if existing systems will be broken because of my patch ...

I have learnt that mem-path commonly refer to hugetlb/hugepage, but actually I 
have no
idea what is the outcome if hugetlb or anything similar was mapped with 
map_private and
copy-on-write happens - will a whole huge page be copied on write then?

I would suppose that in reality system managers may consider directly remove 
the file
first if the file will be truncated anyway. However t would be a different 
story if this
file should be truncated exactly PARTIALLY.

Alternatively maybe another flag "create=on" can be added when private 
semantics are
required, so that if the file exists, the file should be unlinked or truncated 
first
before using?

Since I am nowhere familiar to this part of qemu source code, it will be hard 
for me to
write the additional command line flag part correct, if this is believed to be 
the correct
solution though.

In summary I am glad to learn more of the backgrounds here.

The easiest way not break any existing setup would be to open the file R/O only if opening it R/W failed due to lack of permissions, and we have a private mapping. So, in case of !RAMP_SHARED, simply retry once more without write permissions.

Would that keep your use-case working?


Back to `-mem-path` part. Now I wonder whether filling the initial value for 
ram is what
`-mem-path` is expected behavior (whether I am using a feature that will be 
deprecated
soon); whether there is a convenient method to filling the initial value in 
copy-on-write
styles if `mem-path` is not good to use; and in general whether a privately 
used memory
backed file SHOULD be writeable.

The case that "-mem-path" has always used MAP_PRIVATE and seemed to have worked with postcopy live migration (another user of fallocate(PUNCH_HOLE)) tells me that we should be careful about that.

--
Cheers,

David / dhildenb


Reply via email to