On Wed, Jan 29, 2025 at 01:20:13PM -0500, Steven Sistare wrote: > On 1/17/2025 6:57 PM, Peter Xu wrote: > > On Fri, Jan 17, 2025 at 02:10:14PM -0500, Steven Sistare wrote: > > > On 1/17/2025 1:16 PM, Peter Xu wrote: > > > > On Fri, Jan 17, 2025 at 09:46:11AM -0800, Steve Sistare wrote: > > > > > +/* > > > > > + * Return true if ram contents would be lost during CPR. > > > > > + * Return false for ram_device because it is remapped in new QEMU. > > > > > Do not > > > > > + * exclude rom, even though it is readonly, because the rom file > > > > > could change > > > > > + * in new QEMU. Return false for non-migratable blocks. They are > > > > > either > > > > > + * re-created in new QEMU, or are handled specially, or are covered > > > > > by a > > > > > + * device-level CPR blocker. Return false for an fd, because it is > > > > > visible and > > > > > + * can be remapped in new QEMU. > > > > > + */ > > > > > +static bool ram_is_volatile(RAMBlock *rb) > > > > > +{ > > > > > + MemoryRegion *mr = rb->mr; > > > > > + > > > > > + return mr && > > > > > + memory_region_is_ram(mr) && > > > > > + !memory_region_is_ram_device(mr) && > > > > > + (!qemu_ram_is_shared(rb) || !qemu_ram_is_named_file(rb)) && > > > > > + qemu_ram_is_migratable(rb) && > > > > > + rb->fd < 0; > > > > > +} > > > > > > > > Blocking guest_memfd looks ok, but comparing to add one more block > > > > notifier, can we check all ramblocks once in migrate_prepare(), and fail > > > > that command directly if it fails the check? > > > > > > In an upcoming patch, I will be adding an option analogous to > > > only-migratable which > > > prevents QEMU from starting if anything would block cpr-transfer. That > > > option > > > will be checked when blockers are added, like for only-migratable. > > > migrate_prepare > > > is too late. > > > > > > > OTOH, is there any simpler way to simplify the check conditions? It'll > > > > be > > > > at least nice to break these checks into smaller if conditions for > > > > readability.. > > > > > > I thought the function header comments made it clear, but I could move > > > each > > > comment next to each condition: > > > > > > ... > > > /* > > > * Return false for an fd, because it is visible and can be remapped > > > in > > > * new QEMU. > > > */ > > > if (rb->fd >= 0) { > > > return false; > > > } > > > ... > > > > > > > I wonder if we could stick with looping over all ramblocks, then make > > > > sure > > > > each of them is on the cpr saved fd list. It may need to make > > > > cpr_save_fd() always register with the name of ramblock to do such > > > > lookup, > > > > or maybe we could also cache the ramblock pointer in CprFd, then the > > > > lookup > > > > will be a pointer match check. > > > > > > Some ramblocks are not on the list, such as named files. Plus looping in > > > migrate_prepare is too late as noted above. > > > > > > IMO what I have already implemented using blockers is clean and elegant. > > > > OK if we need to fail it early at boot, then yes blockers are probably > > better. > > > > We'll need one more cmdline parameter. I've no objection, but I don't know > > how to judge when it's ok to add, when it's better not.. I'll leave others > > to comment on this. > > > > But still, could we check it when ramblocks are created? So in that way > > whatever is forbidden is clear in its own path, I feel like that could be > > clearer (like what you did with gmemfd). > > When the ramblock is created, we don't yet know if it is migratable. A > ramblock that is not migratable does not block cpr. Migratable is not known > until vmstate_register_ram calls qemu_ram_set_migratable. Hence that is > where I evaluate conditions and install a blocker. > > Because that is the only place where ram_block_add_cpr_blocker is called, > the test qemu_ram_is_migratable() inside ram_block_add_cpr_blocker is > redundant, and I should delete it.
Hmm.. sounds reasonable. > > > For example, if I start to convert some of your requirements above, then > > memory_region_is_ram_device() implies RAM_PREALLOC. Actually, ram_device > > is not the only RAM_PREALLOC user.. Say, would it also not work with all > > memory_region_init_ram_ptr() users (even if they're not ram_device)? An > > example is, looks like virtio-gpu can create random ramblocks on the fly > > with prealloced buffers. I am not sure whether they can be pinned by VFIO > > too. You may know better. > > That memory is not visible to the guest. It is not part of system_memory, > and is not marked migratable. I _think_ that can still be visible at least for the virtio-gpu use case, which hangs under VirtIOGPUBase.hostmem. Relevant code for reference: virtio_gpu_virgl_map_resource_blob: memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data); memory_region_add_subregion(&b->hostmem, offset, mr); memory_region_set_enabled(mr, true); virtio_gpu_pci_base_realize: memory_region_init(&g->hostmem, OBJECT(g), "virtio-gpu-hostmem", g->conf.hostmem); pci_register_bar(&vpci_dev->pci_dev, 4, PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_PREFETCH | PCI_BASE_ADDRESS_MEM_TYPE_64, &g->hostmem); pci_update_mappings: memory_region_add_subregion_overlap(r->address_space, r->addr, r->memory, 1); but indeed I'm not sure how it work with migration so far, because it doesn't have RAM_MIGRATABLE set. So at least cpr didn't make it more special. I assume this isn't something we must figure out as of now then.. but if you do, please kindly share. Thanks, -- Peter Xu