On Wed, Mar 26, 2025 at 10:40:29PM +0000, Pratyush Yadav wrote:

> Ideally, kho_preserve_folio() should be similar to freeing the folio,
> except that it doesn't go to buddy for re-allocation. In that case,
> re-using those pages should not be a problem as long as the driver made
> sure the page was properly "freed", and there are no stale references to
> it. They should be doing that anyway since they should make sure the
> file doesn't change after it has been serialized.

I don't know if this is a good idea, it seems to make error recovery
much more complex.

> > Then you have the issue that I don't actually imagine shutting down
> > something like iommufd, I was intending to leave it frozen in place
> > with all its allocations and so on. If you try to de-serialize you
> > can't de-serialize into the thing that is frozen, you'd create a new
> > one from empty. Now you have two things pointing at the same stuff,
> > what a mess.
>
> What do you mean by "frozen in place"? Isn't that the same as being
> serialized?

I mean all the memory and internal state is still there, it is just
not changing.

It is not the same as being serialized, as the de-serialized versions
of everything would still exist in parallel.

> Considering that we want to make sure a file is not opened by any
> process before we serialize it, what do we get by keeping the struct
> file around (assuming we can safely deserialize it without going
> through kexec)?

We do a lot less work. Having serialize reliably put the entire system
into a fully post-live-update state, including dependent things like
the iommufd/vfio attachment and iommu driver, is very hard. This stuff
is quite complex.

I imagine instead we have three data states
 - Fully operating
 - Frozen and all preserved memory logged in KHO
 - post-live-update where there are hints scattered around the drivers
   about what is in the KHO

From an error perspective going from frozen back to fully operating
should just be throwing away the KHO record and allowing use of the FD
again. That is super simple and makes error recovery during
micro-steps of the KHO simple and safe.

If you imagine that KHO is destructive then every failure point needs
to unwind the partial destruction which is a total nightmare to code :\

> Main idea is for logical grouping and dependency management. If some FDs
> have a dependency between them, grouping them in different boxes makes
> it easy to let userspace choose the order of operations, but still have
> a way to make sure all dependencies are met when the FDs are serialized.
> Similarly, on the deserialize side, this ensures that all dependent FDs
> are deserialized together.

That seems over complicated to me. Userspace should write the FDs in
the required order and that should be a topological sort of the
required dependencies. The kernel should just validate this was done.

Jason
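
To illustrate the ordering check, here is a minimal sketch of what
"just validate this was done" could look like. It is not an existing
kernel or KHO interface: struct fd_entry, validate_fd_order() and the
integer ids are hypothetical names made up for the example. It assumes
each FD arrives with a list of the ids it depends on. The order is
accepted only if every dependency was written earlier.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one FD handed over for preservation. */
struct fd_entry {
	int id;			/* hypothetical token naming this FD */
	const int *deps;	/* ids of the FDs this one depends on */
	size_t ndeps;
};

/* True if @id is among the first @n entries, i.e. was written earlier. */
static bool seen_before(const struct fd_entry *entries, size_t n, int id)
{
	for (size_t i = 0; i < n; i++)
		if (entries[i].id == id)
			return true;
	return false;
}

/*
 * Check that @entries is in topological order: every dependency of
 * entries[i] must sit at an index below i. Returns -1 if the order is
 * valid, otherwise the index of the first entry with an unmet dependency.
 */
static long validate_fd_order(const struct fd_entry *entries, size_t n)
{
	for (size_t i = 0; i < n; i++)
		for (size_t d = 0; d < entries[i].ndeps; d++)
			if (!seen_before(entries, i, entries[i].deps[d]))
				return (long)i;
	return -1;
}
```

The kernel side never has to sort or group anything here. It only
rejects a sequence where an FD shows up before something it depends on.
A cycle can never pass, since some member of it has to come first.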