On 17.08.23 16:41, Peter Xu wrote:
On Thu, Aug 17, 2023 at 11:07:23AM +0200, David Hildenbrand wrote:
@Stefan, see below on a R/O NVDIMM question.
We're discussing how to get MAPR_PRIVATE R/W mapping of a
memory-backend-file running when using R/O files.
This seems a good idea. I am good with the solution you proposed
here as well.
I was just going to get started working on that, when I realized
something important:
"@readonly: if true, the backing file is opened read-only; if false,
it is opened read-write. (default: false)"
"@share: if false, the memory is private to QEMU; if true, it is
shared (default: false)"
So readonly is *all* about the file access mode already ... the mmap()
parameters are just a side-effect of that. Adding a new
"file-access-mode" or similar would be wrong.
Not exactly a side effect, IMHO. IIUC it's simply because we didn't have a
need of using different perm for memory/file levels. See the other patch
commit message from Stefan:
https://lore.kernel.org/all/20210104171320.575838-2-stefa...@redhat.com/
There is currently no way to open(O_RDONLY) and mmap(PROT_READ) when
[...]
So the goal at that time was to map/open both in RO mode, afaiu. So one
Good point. And you can have both with "share=on,readonly=on" ever since
Stefan introduced that flag, which that patch enabled.
Stefan didn't go into details to describe the required interactions
between MAP_PRIVATE / MAP_SHARED.
[...]
-object memory-backend-file,share=off,readonly=on
BUT, have the mmap be writable (instead of currently readonly).
Assuming we would change the current behavior, what if someone would
specify:
-object memory-backend-file,readonly=on
(because the default is share=off ...) and using it for a R/O NVDIMM,
where we expect any write access to fail.
It will (as expected), right? fallocate() will just fail on the RO files.
Yes, if the file was opened R/O, any fallocate() will fail.
To be explicit, maybe we should just remember the readonly attribute for a
ramblock and then we can even provide a more meaningful error log, like:
Hah, I have a patch that adds RAM_READONLY :) . But it expresses
"mmapped shared" not "file opened shared".
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 3df73542e1..f8c11c8d54 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -3424,9 +3424,13 @@ int qemu_ram_foreach_block(RAMBlockIterFunc func, void
*opaque)
int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length)
{
int ret = -1;
-
uint8_t *host_startaddr = rb->host + start;
+ if (rb->flags & RAM_READONLY) {
+ // more meaningful error reports (though right now no errp passed in..)
+ return -EACCES;
+ }
Remembering whether a file is opened R/O might be reasonable to improve
the error messages.
[...]
But let's look at the commit that added the "readonly" parameter:
commit 86635aa4e9d627d5142b81c57a33dd1f36627d07
Author: Stefan Hajnoczi <stefa...@redhat.com>
Date: Mon Jan 4 17:13:19 2021 +0000
hostmem-file: add readonly=on|off option
Let -object memory-backend-file work on read-only files when the
readonly=on option is given. This can be used to share the contents of a
file between multiple guests while preventing them from consuming
Copy-on-Write memory if guests dirty the pages, for example.
That was part of
https://lore.kernel.org/all/20210104171320.575838-3-stefa...@redhat.com/T/#m712f995e6dcfdde433958bae5095b145dd0ee640
From what I understand, for NVDIMMs we always use
"-object memory-backend-file,share=on", even when we want a
readonly NVDIMM.
So we have two options:
1) Change the current behavior of -object
memory-backend-file,share=off,readonly=on:
-> Open the file r/o but mmap it writable
2) Add a new property to configure the mmap accessibility. Not a big fan of
that.
@Stefan, do you have any concern when we would do 1) ?
As far as I can tell, we have to set the nvdimm to "unarmed=on" either way:
+ "unarmed" controls the ACPI NFIT NVDIMM Region Mapping Structure "NVDIMM
+ State Flags" Bit 3 indicating that the device is "unarmed" and cannot accept
+ persistent writes. Linux guest drivers set the device to read-only when this
+ bit is present. Set unarmed to on when the memdev has readonly=on.
So changing the behavior would not really break the nvdimm use case.
Further, we could warn in nvdimm code when we stumble over this configuration
with
unarmed=on.
I'll leave nvdimm specific question to Stefan, but isn't this also will map
any readonly=on memory backends (besides nvdimm) to have the memory
writable, which is still unexpected?
Note that libvirt *never* sets readonly=on for R/O NVDIMMs, and R/O
NVDIMMs are really the only use case.
I don't think "open file read only" raises the expectation "map it read
only". It certainly does for shared mappings, but not for private mappings.
To me, the expectation of "open the file read only" is that the file
won't be modified, ever.
Libvirt should probably be taught to use "share=on,readonly=on" when
dealing with R/O NVDIMMs, instead of "share=off" (implying readonly=on)
to create a writable COW mapping.
--
Cheers,
David / dhildenb