At 2023-07-26 16:11:44, "David Hildenbrand" <da...@redhat.com> wrote:
>
>> though the file never gets written. (the actual memory file & guest state 
>> file require
>> separated hacking)
>> 
>> And at least the patch provided here have been the solution to this last 
>> problem for me
>> for a while.
>> 
>> By the way the commit: "Commit 134253a4, machine: do not crash if default 
>> RAM backend name
>> has been stolen" disallows me to use a memory backed file directly as pc.ram 
>> and make
>> `-object memory-backend-file,*` based setup more complex (I cannot easily 
>> make the memory
>
>Can't you simply do
>
>-object memory-backend-file,id=mem1 \
>-machine q35,memory-backend=mem1,share=off \
>
>Or what would be the problem with that?

I could do this, though I have not tested it. For me the biggest problem is
that machines using a custom memory backend are likely not compatible with
machines using the default memory backend, and this is not convenient -
they effectively become different machine types and cannot migrate
from one to the other ...

My hack mostly amounts to using `-machine q35` when taking a snapshot,
`-machine q35 -mem-path filename4pc.ram` when I want to provide a separate
memory file on recovery, and `-machine q35` when I do not want to provide a
memory file.
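
Spelled out, the three invocations look roughly like this (the memory size and
trailing options are placeholders; only the presence of `-mem-path` differs):

```shell
# Taking a snapshot: plain machine, RAM ends up in the guest state file.
qemu-system-x86_64 -machine q35 -m 2G ...

# Recovering with a separate memory file backing pc.ram:
qemu-system-x86_64 -machine q35 -m 2G -mem-path filename4pc.ram ...

# Recovering without a separate memory file:
qemu-system-x86_64 -machine q35 -m 2G ...
```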

I prefer having flexibility when recovering from a snapshot: I can either
provide a guest state file containing everything, trading memory initialization
cost for convenience and disk usage (the memory file is, and can only be,
stored uncompressed), or provide a memory file plus a guest state file, which
focuses more on memory performance.

>> 
>>>
>>> While it doesn't work "in the general case", it works in the "single file 
>>> owner" case
>>> where someone simply forgot to specify "share=on" -- "share=off" is the 
>>> default for
>>> memory-backend-file :( .
>>>
>>>
>>> For example, with hugetlb+virtio-mem the following works if the file does 
>>> not exists:
>>>
>>> (note that virtio-mem will fallocate(FALLOC_FL_PUNCH_HOLE) the whole file 
>>> upfront)
>>>
>>> ...
>>> -object 
>>> memory-backend-file,share=off,mem-path=/dev/hugepages/vmem0,id=mem2,size=2G 
>>> \
>>> -device virtio-mem-pci,id=vmem0,memdev=mem2,requested-size=1g,bus=root
>>>
>>>
>>> With your patch, once the file already exists, we would now get
>>>
>>> qemu-system-x86_64: -device
>> virtio-mem-pci,id=vmem0,memdev=mem2,requested-size=1g,bus=root: 
>> ram_block_discard_range:
>> Failed to fallocate :0 +80000000 (-9)
>>> qemu-system-x86_64: -device
>> virtio-mem-pci,id=vmem0,memdev=mem2,requested-size=1g,bus=root: Unexpected 
>> error
>> discarding RAM: Bad file descriptor
>>>
>>>
>>> So this has the potential to break existing setups.
>>>
>>> The easy fix for these would be to configure "share=on" in these 
>>> now-failing setups. Hmmmmm ....
>> 
>> I am afraid that the easiest fix could be to configure `share=on` when 
>> the path starts
>> with "/dev/huge" while firing a warning :D
>> 
>> I am sorry about that if existing systems will be broken because of my patch 
>> ...
>> 
>> I have learnt that mem-path commonly refer to hugetlb/hugepage, but actually 
>> I have no
>> idea what is the outcome if hugetlb or anything similar was mapped with 
>> map_private and
>> copy-on-write happens - will a whole huge page be copied on write then?
>> 
>> I would suppose that in reality system managers may consider directly remove 
>> the file
>> first if the file will be truncated anyway. However it would be a different 
>> story if this
>> file should be truncated exactly PARTIALLY.
>> 
>> Alternatively maybe another flag "create=on" can be added when private 
>> semantics are
>> required, so that if the file exists, the file should be unlinked or 
>> truncated first
>> before using?
>> 
>> Since I am nowhere near familiar with this part of the qemu source code, it will be 
>> hard for me to
>> write the additional command line flag part correct, if this is believed to 
>> be the correct
>> solution though.
>> 
>> In summary I am glad to learn more of the backgrounds here.
>
>The easiest way not break any existing setup would be to open the file 
>R/O only if opening it R/W failed due to lack of permissions, and we 
>have a private mapping. So, in case of !RAMP_SHARED, simply retry once 
>more without write permissions.
>
>Would that keep your use-case working?
>

I believe yes. When I want to make sure the memory file is not changed, I only
need to make sure the file is read-only at the filesystem level, either by
mounting read-only or via read-only file modes.

Though this is not perfect - it means that when I accidentally make the memory
file writable, qemu will open it in read-write mode. This is a bit scary and
counter-intuitive.

Come to think of it, this solution seems the most likely to be accepted, since
it works most of the time no matter how the other parts of the code change. It
is not even tied to whether the memory pages are shared or anything similar.

I will give this a shot later.

As I have learned from these messages (though I feel I do not fully understand
the postcopy live migration mechanism), I can tell that the problems behind the
choice of shared/private, read-only/read-write, and whether to punch holes are
convoluted and might take months to solve ...

>> 
>> Back to `-mem-path` part. Now I wonder whether filling the initial value for 
>> ram is what
>> `-mem-path` is expected behavior (whether I am using a feature that will be 
>> deprecated
>> soon); whether there is a convenient method to filling the initial value in 
>> copy-on-write
>> styles if `mem-path` is not good to use; and in general whether a privately 
>> used memory
>> backed file SHOULD be writeable.
>
>The case that "-mem-path" has always used MAP_PRIVATE and seemed to have 
>worked with postcopy live migration (another user of 
>fallocate(PUNCH_HOLE)) tells me that we should be careful about that.
>

Ha, I did not expect that the current `-mem-path` behavior is not fully by
design. :D

--

Regards,

logoerthiner
