On Fri, Jun 17, 2016 at 09:15:06AM +0100, Dr. David Alan Gilbert wrote: > * Eduardo Habkost (ehabk...@redhat.com) wrote: > > On Thu, Jun 16, 2016 at 06:12:12PM +0100, Dr. David Alan Gilbert (git) > > wrote: > > > From: "Dr. David Alan Gilbert" <dgilb...@redhat.com> > > > > > > Currently QEMU sets the x86 number of physical address bits to the > > > magic number 40. This is only correct on some small AMD systems; > > > Intel systems tend to have 36, 39, 46 bits, and large AMD systems > > > tend to have 48. > > > > > > Having the value different from your actual hardware is detectable > > > by the guest and in principal can cause problems; > > > > What kind of problems? > > > > Is it a problem to have something smaller from the actual > > hardware, or just if it's higher? > > I'm a bit vague on the failure cases; but my understanding of the two > cases are; > > Larger is a problem if the guest tries to map something to a high > address that's not addressable. > > Smaller is potentially a problem if the guest plays tricks with > what it thinks are spare bits in page tables but which are actually > interpreted. I believe KVM plays a trick like this.
If both smaller and larger are a problem, we have a much bigger problem than we thought. We need to confirm this. So, what happens if the guest play tricks in bits 40-45 when QEMU sets the limit to 40 but we are running in a 46-bit host? Is it really a problem? I assumed it would be safe. > > > > The current limit of 40 stops TB VMs being created by those lucky > > > enough to have that much. > > > > > > This patch lets you set the physical bits by a cpu property but > > > defaults to the same existing magic 40. > > > > > > > The existing 40-bit default looks like a problem for 36-bit > > systems. Do you know what kind of systems have 36 bits only? Only > > old ones, or recent ones too? If only old ones, how old? > > My Sandy Bridge (~2.5 year old) laptop is 36 bits; I've seen > some other single-socket 39bit machines (Ivy bridge and I think newer). > > > Can't we have a new default that is as small as possible for the > > VM RAM+devices configuration? > > That would be good to have, however: > 1) I didn't want to change the default upstream behaviour at first, > this time through I just wanted a way to set it. Agreed. We can change the default after we introduce the mechanisms to handle the existing defaults properly. > 2) While we have maxmem settings to tell us the top of VM RAM, do > we have anything that tells us the top of IO space? What happens > when we hotplug a PCI card? (CCing Marcel and Michael, as we were discussing this recently.) That's a good question. When calculating how many bits the machine requires, machine code could choose to reserve a reasonable amount of space for hotplug by default. Whatever we choose as the default, in some corner cases (e.g. almost-32GB VMs running in a 39-bit host) we will still need to let the user choose between having extra space for hotplug and being able to safely migrate to 36-bit hosts. This can be solved by setting x86_64-cpu.phys-bits, but isn't it too low level? We could have a higher level mechanism in the PC machine class (e.g. a min-extra-phys-addr-space-for-hotplug property). > 3) Is it better to stick to sizes that correspond to real hardware > if you can? For example I don't know of any machines with 37 bits > - in practice I think it's best to stick with sizes that correspond > to some real hardware. Yeah, "as small as possible" could be actually "the smallest possible value from a set of known-to-exist values". e.g. if we find out that we need 37 bits, it's probably better to simply use 39 bits. Choosing from a smaller set of values also makes corner cases (like the example above) less likely to happen. > > > > I've removed the ancient warning about the 42 bit limit in exec.c; > > > I can't find that limit in there and no one else seems to know where > > > it is. > > > > > > Signed-off-by: Dr. David Alan Gilbert <dgilb...@redhat.com> > > > --- > > > target-i386/cpu.c | 8 +++++--- > > > target-i386/cpu.h | 3 +++ > > > 2 files changed, 8 insertions(+), 3 deletions(-) > > > > > > diff --git a/target-i386/cpu.c b/target-i386/cpu.c > > > index 4111240..c3bbf8e 100644 > > > --- a/target-i386/cpu.c > > > +++ b/target-i386/cpu.c > > > @@ -2606,9 +2606,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t > > > index, uint32_t count, > > > /* virtual & phys address size in low 2 bytes. */ > > > /* XXX: This value must match the one used in the MMU code. */ > > > > Do you know where's the MMU code mentioned here? > > No! > > > > if (env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_LM) { > > > - /* 64 bit processor */ > > > -/* XXX: The physical address space is limited to 42 bits in exec.c. */ > > > - *eax = 0x00003028; /* 48 bits virtual, 40 bits physical */ > > > + /* 64 bit processor, 48 bits virtual, configurable > > > + * physical bits. > > > + */ > > > + *eax = 0x00003000 + cpu->phys_bits; > > > > We should reject the configuration if phys-bits is set to > > something larger than the host's phys_bits, shouldn't we? > > > > Maybe we can't do that on old machine-types that already have the > > 40-bit default, but if we have a new reasonable default based on > > VM size, we can be more strict. > > See notes above; but my worry with rejecting stuff is I don't > know how many thousands of machines are already relying on it > in farms with a mix of single and multi socket hosts. That's true. Maybe we should just print a warning instead. -- Eduardo