On Fri, Feb 25, 2022 at 12:36:24PM +0000, Joao Martins wrote:
> I am trying to approach this iteratively, starting by fixing AMD 1T+ guests
> with something that hopefully is less painful to bear and unbreaks users
> doing multi-TB guests on kernels >= 5.4, while for < 5.4 it would no longer
> wrongly DMA-map bad IOVAs that may lead to the guests' own spurious failures.
>
> For the long term, qemu would need some sort of handling for a configurable
> sparse map of all guest RAM, which currently does not exist (it's stuffed
> inside on a per-machine basis, as you're aware). What I am unsure of is the
> churn associated with it (compat, migration, mem-hotplug, nvdimms,
> memory-backends) versus the benefit if it's "just" one class of x86 platforms
> (Intel not affected) -- which is what I find attractive about the past 2
> revisions via a smaller change.
Right. I pondered this for a while, and I wonder whether you considered making
this depend on the guest cpu vendor and max phys bits. Things are easier to
debug if the memory map is the same regardless of the host; the guest vendor
typically matches the host cpu vendor after all, and there could well be
guests that avoid the reserved memory ranges on principle.

We'll need a bunch of code comments explaining all this hackery, as well as
machine type compat handling, but that is par for the course.

Additionally, we could have a host check and then fail to init vdpa and vfio
devices if the memory map would make some memory inaccessible. A rough sketch
of both ideas is appended below.

Does this sound reasonable to others? Alex? Joao?

--
MST
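
To make the two ideas above concrete, here is a minimal, untested C sketch.
It is not written against real QEMU APIs: GuestInfo, above_4g_mem_start(),
memory_map_fits() and the host_iova_bits parameter are all made up for
illustration. The only hard numbers are the AMD HyperTransport reserved
range, the 12GiB immediately below 1TiB starting at 0xFD00000000.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define GiB           (1ULL << 30)
#define AMD_HT_START  0xFD00000000ULL   /* reserved HT range: 1TiB - 12GiB */
#define AMD_HT_SIZE   (12 * GiB)

typedef struct GuestInfo {
    bool is_amd;             /* guest (not host) cpu vendor is AMD */
    unsigned phys_bits;      /* guest maxphysaddr */
    uint64_t above_4g_size;  /* RAM to be placed above 4G */
} GuestInfo;

/* Pick the start of the above-4G RAM region from guest vendor and size. */
static uint64_t above_4g_mem_start(const GuestInfo *gi)
{
    uint64_t start = 4 * GiB;

    /*
     * Only relocate for AMD guests whose RAM would otherwise spill into
     * the HT reserved range; everyone else keeps the flat layout, so the
     * memory map stays independent of the host.
     */
    if (gi->is_amd && start + gi->above_4g_size > AMD_HT_START) {
        start = AMD_HT_START + AMD_HT_SIZE;   /* i.e. skip to 1TiB */
    }
    return start;
}

/*
 * Host-side check a vfio/vdpa backend could run at init time: refuse to
 * start if the resulting map places RAM beyond what the guest maxphysaddr
 * or the host IOMMU can address, instead of silently DMA-mapping bad IOVAs.
 */
static bool memory_map_fits(const GuestInfo *gi, unsigned host_iova_bits)
{
    uint64_t ram_top = above_4g_mem_start(gi) + gi->above_4g_size;
    unsigned limit = gi->phys_bits < host_iova_bits ? gi->phys_bits
                                                    : host_iova_bits;

    return ram_top <= (1ULL << limit);
}

int main(void)
{
    /* Hypothetical AMD guest with 4TiB of RAM above the 4G boundary. */
    GuestInfo gi = { .is_amd = true, .phys_bits = 48,
                     .above_4g_size = 4096ULL * GiB };

    printf("above-4G RAM starts at 0x%llx, fits: %d\n",
           (unsigned long long)above_4g_mem_start(&gi),
           memory_map_fits(&gi, 48));
    return 0;
}

For that example guest, the above-4G region would start at 1TiB rather than
4GiB, and the check passes since 5TiB of address space is well within 48
bits on both sides.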