On 17/06/2016 15:18, Eduardo Habkost wrote:
> On Fri, Jun 17, 2016 at 09:15:06AM +0100, Dr. David Alan Gilbert wrote:
>> * Eduardo Habkost (ehabk...@redhat.com) wrote:
>>> On Thu, Jun 16, 2016 at 06:12:12PM +0100, Dr. David Alan Gilbert (git)
>>> wrote:
>>>> From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
>>>>
>>>> Currently QEMU sets the x86 number of physical address bits to the
>>>> magic number 40. This is only correct on some small AMD systems;
>>>> Intel systems tend to have 36, 39, 46 bits, and large AMD systems
>>>> tend to have 48.
>>>>
>>>> Having the value different from your actual hardware is detectable
>>>> by the guest and in principle can cause problems;
>>>
>>> What kind of problems?
>>>
>>> Is it a problem to have something smaller than the actual
>>> hardware, or just if it's higher?
>>
>> I'm a bit vague on the failure cases; but my understanding of the two
>> cases is:
>>
>> Larger is a problem if the guest tries to map something to a high
>> address that's not addressable.
(Note: this is a problem when migrating to hosts with _smaller_ phys-bits)

>> Smaller is potentially a problem if the guest plays tricks with
>> what it thinks are spare bits in page tables but which are actually
>> interpreted. I believe KVM plays a trick like this.

(Note: this is a problem when migrating to hosts with _larger_ phys-bits)

> If both smaller and larger are a problem, we have a much bigger
> problem than we thought. We need to confirm this.
>
> So, what happens if the guest plays tricks in bits 40-45 when QEMU
> sets the limit to 40 but we are running in a 46-bit host? Is it
> really a problem? I assumed it would be safe.

The guest expects a "reserved bit set" page fault, but doesn't get one.

>> 2) While we have maxmem settings to tell us the top of VM RAM, do
>> we have anything that tells us the top of IO space? What happens
>> when we hotplug a PCI card?
>
> (CCing Marcel and Michael, as we were discussing this recently.)
>
> That's a good question. When calculating how many bits the
> machine requires, machine code could choose to reserve a
> reasonable amount of space for hotplug by default.
>
> Whatever we choose as the default, in some corner cases (e.g.
> almost-32GB VMs running in a 39-bit host) we will still need to
> let the user choose between having extra space for hotplug and
> being able to safely migrate to 36-bit hosts.

No, this is not possible unfortunately. If you set phys-bits <
host-phys-bits, the guest may expect some bits to be reserved, when they
actually aren't. In practice this doesn't happen for the reasons I
mentioned in my other message (tl;dr: 1-the trick is rarely used though
KVM uses it, 2-if they use bit 51 they're safe in practice). But still,
making phys-bits smaller than host-phys-bits is a bad idea.

Making the guest's phys-bits larger than host-phys-bits would be okay if
you reserve the area in the e820 and assume the guest doesn't touch it.
But it is not a great idea either, because e820 describes RAM, so you're
telling the guest "look, there's 64 TB of reserved RAM up there".

>> 3) Is it better to stick to sizes that correspond to real hardware
>> if you can? For example I don't know of any machines with 37 bits
>> - in practice I think it's best to stick with sizes that correspond
>> to some real hardware.
>
> Yeah, "as small as possible" could be actually "the smallest
> possible value from a set of known-to-exist values". e.g. if we
> find out that we need 37 bits, it's probably better to simply use
> 39 bits.
>
> Choosing from a smaller set of values also makes corner cases
> (like the example above) less likely to happen.

Not really, because any value that doesn't match the host is
problematic, albeit in different ways.

Paolo
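
P.S. To make the "reserved bit set" point concrete, here is a minimal
sketch of which page-table address bits a guest would consider reserved
for a given phys-bits value, and which of those the host would silently
accept instead of faulting on. It is not QEMU or KVM code: the function
names are hypothetical, and it assumes only the x86-64 rule that
page-table entry bits from phys-bits up to bit 51 are reserved.

```python
# Illustrative sketch only: not QEMU or KVM code; all names are hypothetical.

PTE_ADDR_TOP = 52  # x86-64 page-table entries carry address bits up to bit 51


def reserved_addr_mask(phys_bits: int) -> int:
    """Mask of PTE bits [51:phys_bits], reserved when the CPU reports phys_bits."""
    if not 32 <= phys_bits <= PTE_ADDR_TOP:
        raise ValueError("phys_bits out of range for this sketch")
    return ((1 << PTE_ADDR_TOP) - 1) & ~((1 << phys_bits) - 1)


def missed_fault_bits(guest_phys_bits: int, host_phys_bits: int) -> list[int]:
    """Bits the guest believes are reserved but the host treats as address bits.

    Setting one of these in a PTE will not raise the "reserved bit set"
    page fault that the guest expects.
    """
    diff = reserved_addr_mask(guest_phys_bits) & ~reserved_addr_mask(host_phys_bits)
    return [bit for bit in range(PTE_ADDR_TOP) if (diff >> bit) & 1]


if __name__ == "__main__":
    # Guest told 40 bits, host really has 46: bits 40-45 no longer fault.
    print(missed_fault_bits(40, 46))  # [40, 41, 42, 43, 44, 45]
    # Bit 51 stays reserved in both cases, which is why using it is safe in practice.
    print(bool(reserved_addr_mask(40) & reserved_addr_mask(46) & (1 << 51)))  # True
```

With guest phys-bits larger than the host's, missed_fault_bits() returns an
empty list; the problem in that direction is the other one discussed
above, namely the guest mapping something at an address the host cannot
reach.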