Hi Peter!

-    fd = file_ram_open(mem_path, memory_region_name(mr), readonly, &created,
-                       errp);
+    fd = file_ram_open(mem_path, memory_region_name(mr), readonly, &created);
+    if (fd == -EACCES && !(ram_flags & RAM_SHARED) && !readonly) {
+        /*
+         * We can have a writable MAP_PRIVATE mapping of a readonly file.
+         * However, some operations like ftruncate() or fallocate() might fail
+         * later, let's warn the user.
+         */
+        fd = file_ram_open(mem_path, memory_region_name(mr), true, &created);
+        if (fd >= 0) {
+            warn_report("backing store %s for guest RAM (MAP_PRIVATE) opened"
+                        " readonly because the file is not writable",
+                        mem_path);

I can understand the use case, but this will be slightly unwanted,
especially since the user doesn't yet have a way to predict when it will
happen.

Users can set the file permissions accordingly, I guess. If they want the file to never be modified via QEMU, they can set it R/O.


Meanwhile, this changes the behavior. Is it a concern that someone may want
to rely on the current behavior of failing?

The scenario would be that someone passes a readonly file to "-mem-path" or "-object memory-backend-file,share=off,readonly=off", expecting that to fail, as it currently does.
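
For concreteness, that scenario corresponds to an invocation along these lines (illustrative only; the path and sizes are made up):

qemu-system-x86_64 -m 4G \
    -object memory-backend-file,id=pc.ram,size=4G,mem-path=/path/to/readonly.file,share=off,readonly=off \
    -machine memory-backend=pc.ram

Today this fails at startup because the file cannot be opened R/W; with this patch it would start and print the warning instead.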

If it now doesn't fail (and we warn instead), what would happen is:
* In file_ram_alloc() we won't even try ftruncate(), because the file
  already had a size > 0. So ftruncate() is not a concern as I now
  realize.
* fallocate might fail later. AFAICS, that only applies to
  ram_block_discard_range() (see the sketch after this list).
 -> virtio-mem performs an initial ram_block_discard_range() check and
    fails gracefully early.
 -> virtio-balloon ignores any errors
 -> ram_discard_range() in migration code fails early for postcopy in
    init_range() and loadvm_postcopy_ram_handle_discard(), handling it
    gracefully.
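
To see the failure mode concretely, here is a minimal standalone sketch (hypothetical file name; punching a hole requires an fd that is open for writing, so on an O_RDONLY fd this fails with EBADF, which is what the callers above would run into):

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>      /* open(), fallocate(), FALLOC_FL_* */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    /* Any pre-sized file opened readonly; the name is just an example. */
    int fd = open(argc > 1 ? argv[1] : "guest.ram", O_RDONLY);

    if (fd < 0) {
        perror("open");
        return 1;
    }
    /* Roughly what ram_block_discard_range() does for file-backed RAM. */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  0, 4096) < 0) {
        fprintf(stderr, "fallocate: %s\n", strerror(errno)); /* EBADF */
    }
    close(fd);
    return 0;
}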

So mostly nothing "bad" would happen; it might just be undesirable, and we properly warn about it.

Most importantly, we won't be corrupting/touching the original file in any case, because it is R/O.
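
A minimal sketch of that guarantee (file name made up; assumes a non-empty file):

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int fd = open("guest.ram", O_RDONLY);   /* example name */
    char before, after;

    if (fd < 0) {
        perror("open");
        return 1;
    }
    pread(fd, &before, 1, 0);
    /* A writable MAP_PRIVATE mapping of a readonly fd is allowed ... */
    char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    /* ... the first write COWs the page into anonymous memory ... */
    p[0] = before + 1;
    /* ... so the file content on disk stays untouched. */
    pread(fd, &after, 1, 0);
    printf("file byte unchanged: %s\n", before == after ? "yes" : "no");
    munmap(p, 4096);
    close(fd);
    return 0;
}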

If we really want to be careful, we could glue that behavior to compat machines. I'm not sure yet whether we really have to go down that path.

Any other alternatives? I'd like to avoid new flags where not really required.


Thinking about the current use case from a higher level, the ideal solution
seems to me that if the RAM file can be put on a file system that supports
CoW itself (like btrfs), we can snapshot that RAM file and make the snapshot
R/W for the QEMU instance. Then it'll be able to open the file here. We'll
be able to keep the interface working as before, and I assume it'll work
with fallocate or truncation too.
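
A snapshot of that kind could be taken with a reflink clone, e.g. (sketch only; file names made up, and FICLONE requires both files to live on the same reflink-capable filesystem such as btrfs):

#include <fcntl.h>
#include <linux/fs.h>   /* FICLONE */
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    int src = open("template.ram", O_RDONLY);
    int dst = open("instance.ram", O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (src < 0 || dst < 0) {
        perror("open");
        return 1;
    }
    /* Share all data blocks; writes to instance.ram COW at the fs level. */
    if (ioctl(dst, FICLONE, src) < 0) {
        perror("ioctl(FICLONE)");
        return 1;
    }
    close(src);
    close(dst);
    return 0;
}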

Would that be better instead of changing QEMU?

As I recently learned, using file-backed VMs (on real SSDs/disks, not shmem/hugetlb) is usually undesired, because the dirtied pages will constantly get written back to disk by background writeback threads, eventually resulting in bad performance and SSD wear.

So while using a COW filesystem sounds cleaner in theory, it's not applicable in practice -- unless one disables any background writeback, which has different side effects because it cannot be configured on a per-file basis.

So for VM templating, it makes sense to capture the guest RAM and store it in a file, to then use a COW (MAP_PRIVATE) mapping. Using a read-only file makes perfect sense in that scenario IMHO.

[I'm curious at what point a filesystem will actually break COW. If it's wired up to the writenotify infrastructure, it would happen when actually writing to a page, not at mmap time. I know that filesystems use writenotify for lazy allocation of disk blocks in file holes; maybe they also do that for lazy allocation of disk blocks on COW.]

Thanks!

--
Cheers,

David / dhildenb

