On 17.08.23 16:41, Peter Xu wrote:
On Thu, Aug 17, 2023 at 11:07:23AM +0200, David Hildenbrand wrote:
@Stefan, see below on a R/O NVDIMM question.

We're discussing how to get MAPR_PRIVATE R/W mapping of a
memory-backend-file running when using R/O files.


This seems a good idea. I am good with the solution you proposed
here as well.

I was just going to get started working on that, when I realized
something important:


"@readonly: if true, the backing file is opened read-only; if false,
  it is opened read-write.  (default: false)"

"@share: if false, the memory is private to QEMU; if true, it is
  shared (default: false)"

So readonly is *all* about the file access mode already ... the mmap()
parameters are just a side-effect of that. Adding a new
"file-access-mode" or similar would be wrong.

Not exactly a side effect, IMHO.  IIUC it's simply because we didn't have a
need of using different perm for memory/file levels.  See the other patch
commit message from Stefan:

https://lore.kernel.org/all/20210104171320.575838-2-stefa...@redhat.com/

     There is currently no way to open(O_RDONLY) and mmap(PROT_READ) when
     [...]

So the goal at that time was to map/open both in RO mode, afaiu.  So one

Good point. And you can have both with "share=on,readonly=on" ever since Stefan introduced that flag, which that patch enabled.

Stefan didn't go into details to describe the required interactions between MAP_PRIVATE / MAP_SHARED.

[...]

-object memory-backend-file,share=off,readonly=on

BUT, have the mmap be writable (instead of currently readonly).

Assuming we would change the current behavior, what if someone would
specify:

-object memory-backend-file,readonly=on

(because the default is share=off ...) and using it for a R/O NVDIMM,
where we expect any write access to fail.

It will (as expected), right?  fallocate() will just fail on the RO files.

Yes, if the file was opened R/O, any fallocate() will fail.


To be explicit, maybe we should just remember the readonly attribute for a
ramblock and then we can even provide a more meaningful error log, like:

Hah, I have a patch that adds RAM_READONLY :) . But it expresses "mmapped shared" not "file opened shared".


diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 3df73542e1..f8c11c8d54 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -3424,9 +3424,13 @@ int qemu_ram_foreach_block(RAMBlockIterFunc func, void 
*opaque)
  int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length)
  {
      int ret = -1;
-
      uint8_t *host_startaddr = rb->host + start;
+ if (rb->flags & RAM_READONLY) {
+        // more meaningful error reports (though right now no errp passed in..)
+        return -EACCES;
+    }

Remembering whether a file is opened R/O might be reasonable to improve the error messages.

[...]


But let's look at the commit that added the "readonly" parameter:

commit 86635aa4e9d627d5142b81c57a33dd1f36627d07
Author: Stefan Hajnoczi <stefa...@redhat.com>
Date:   Mon Jan 4 17:13:19 2021 +0000

     hostmem-file: add readonly=on|off option
     Let -object memory-backend-file work on read-only files when the
     readonly=on option is given. This can be used to share the contents of a
     file between multiple guests while preventing them from consuming
     Copy-on-Write memory if guests dirty the pages, for example.

That was part of

https://lore.kernel.org/all/20210104171320.575838-3-stefa...@redhat.com/T/#m712f995e6dcfdde433958bae5095b145dd0ee640

 From what I understand, for NVDIMMs we always use
"-object memory-backend-file,share=on", even when we want a
readonly NVDIMM.


So we have two options:

1) Change the current behavior of -object 
memory-backend-file,share=off,readonly=on:

-> Open the file r/o but mmap it writable

2) Add a new property to configure the mmap accessibility. Not a big fan of 
that.


@Stefan, do you have any concern when we would do 1) ?

As far as I can tell, we have to set the nvdimm to "unarmed=on" either way:

+   "unarmed" controls the ACPI NFIT NVDIMM Region Mapping Structure "NVDIMM
+   State Flags" Bit 3 indicating that the device is "unarmed" and cannot accept
+   persistent writes. Linux guest drivers set the device to read-only when this
+   bit is present. Set unarmed to on when the memdev has readonly=on.

So changing the behavior would not really break the nvdimm use case.

Further, we could warn in nvdimm code when we stumble over this configuration 
with
unarmed=on.

I'll leave nvdimm specific question to Stefan, but isn't this also will map
any readonly=on memory backends (besides nvdimm) to have the memory
writable, which is still unexpected?

Note that libvirt *never* sets readonly=on for R/O NVDIMMs, and R/O NVDIMMs are really the only use case.

I don't think "open file read only" raises the expectation "map it read only". It certainly does for shared mappings, but not for private mappings.

To me, the expectation of "open the file read only" is that the file won't be modified, ever.

Libvirt should probably be taught to use "share=on,readonly=on" when dealing with R/O NVDIMMs, instead of "share=off" (implying readonly=on) to create a writable COW mapping.

--
Cheers,

David / dhildenb


Reply via email to