On 04/19/2018 02:58 PM, Cornelia Huck wrote: > On Thu, 19 Apr 2018 14:33:18 +0200 > Igor Mammedov <imamm...@redhat.com> wrote: > >> On Thu, 19 Apr 2018 17:21:23 +1000 >> David Gibson <da...@gibson.dropbear.id.au> wrote: >> >>> If the -mem-path option is set, we attempt to map the guest's RAM from a >>> file in the given path; it's usually used to back guest RAM with hugepages. >>> If we're unable to (e.g. not enough free hugepages) then we fall back to >>> allocating normal anonymous pages. This behaviour can be surprising, but a >>> comment in allocate_system_memory_nonnuma() suggests it's legacy behaviour >>> we can't change. >>> >>> What really isn't ok, though, is that in this case we leave mem_path set. >>> That means functions which attempt to determine the pagesize of main RAM >>> can erroneously think it is hugepage based on the requested path, even >>> though it's not. >>> >>> This is particular bad for the pseries machine type. KVM HV limitations >>> mean the guest can't use pagesizes larger than the host page size used to >>> back RAM. That means that such a fallback, rather than merely giving >>> poorer performance that expected will cause the guest to freeze up early in >>> boot as it attempts to use large page mappings that can't work. >>> >>> This patch addresses the problem by clearing the mem_path variable when we >>> fall back to anonymous pages, meaning that subsequent attempts to >>> determine the RAM page size will get an accurate result. >>> >>> Signed-off-by: David Gibson <da...@gibson.dropbear.id.au> >>> --- >>> numa.c | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> Paolo et al, as with my earlier patches adding some extensions to the >>> helpers for determining backing page sizes, if there are no objections >>> can I get an ack to merge this via my ppc tree? >>> >>> diff --git a/numa.c b/numa.c >>> index 1116c90af9..78a869e598 100644 >>> --- a/numa.c >>> +++ b/numa.c >>> @@ -469,6 +469,7 @@ static void allocate_system_memory_nonnuma(MemoryRegion >>> *mr, Object *owner, >>> /* Legacy behavior: if allocation failed, fall back to >>> * regular RAM allocation. >>> */ >>> + mem_path = NULL; >>> memory_region_init_ram_nomigrate(mr, owner, name, ram_size, >>> &error_fatal); >>> } >>> #else >> >> mem_path is also used by kvm_s390_apply_cpu_model(), >> and in ccw_init() memory is initialized before CPUs are >> so if QEM was started with -mem-path, then before patch >> created CPU won't have CMM enabled and print warning: >> >> "CMM will not be enabled because it is not compatible with hugetlbfs." >> >> and after patch it might enable CMM if we clear mem_path. >> So question is do we care about this? > > I don't quite remember the cmm semantics here -- Christian?
The CMMA interface does not work on large pages. I think the kernel will react with EFAULT in some cases (cmma migration and others) so qemu will probably fail unexpectedly. But this patch seems to only clear mem-path if we do not allocate at all from hugetlbfs. So things should be ok, no?