On 03/18/2017 at 01:38 AM, Eric W. Biederman wrote:
> Xunlei Pang <xlp...@redhat.com> writes:
>
>> kexec setups identity mappings for all the memory mapped in 1st kernel,
>> this is not necessary for the kdump case. Actually it can cause extra
>> memory consumption for paging structures, which is quite considerable
>> on modern machines with huge memory.
>>
>> E.g. On our 24TB machine, it will waste around 96MB (around 4MB/TB)
>> from the reserved memory range if setting all the identity mappings.
>>
>> It also causes some trouble for distributions that use an intelligent
>> policy to evaluate the proper "crashkernel=X" for users.
>>
>> To solve it, in case of kdump, we only setup identity mappings for the
>> crash memory and the ISA memory(may be needed by purgatory/kdump
>> boot).
> How about instead we detect the presence of 1GiB pages and use them
> if they are available. We already use 2MiB pages. If we can do that
> we will only need about 192K for page tables in the case you have
> described and this all becomes a non-issue.
>
> I strongly suspect that the presence of 24TiB of memory in an x86 system
> strongly correlates to the presence of 1GiB pages.
>
> In principle we certainly can use a less extensive mapping but that
> should not be something that differs between the two kexec cases.
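
The figures quoted above (around 4MB/TB with 2MiB pages, about 192K with 1GiB pages on the 24TB machine) can be checked with a short calculation. This is a minimal sketch, assuming 4 KiB page-table pages holding 512 entries each, as on x86-64 with 4-level paging. The helper names are hypothetical and do not exist in the kernel.

```c
#include <stdint.h>

/* Assumptions: 4 KiB page-table pages, 512 entries per table. */
#define PT_PAGE_SIZE      4096ULL
#define ENTRIES_PER_TABLE 512ULL
#define GiB               (1ULL << 30)
#define TiB               (1ULL << 40)

/* Hypothetical helper: integer division rounding up. */
static uint64_t div_round_up(uint64_t n, uint64_t d)
{
	return (n + d - 1) / d;
}

/*
 * Bytes of PMD-level tables for an identity mapping built from 2MiB
 * pages: one table covers 512 * 2MiB = 1GiB of address space.
 */
static uint64_t pmd_table_bytes(uint64_t mem_bytes)
{
	return div_round_up(mem_bytes, GiB) * PT_PAGE_SIZE;
}

/*
 * Bytes of PUD-level tables: one table covers 512 * 1GiB = 512GiB.
 * With 1GiB pages these are the lowest-level tables needed, so this
 * is roughly the whole cost (the single top-level page is ignored).
 */
static uint64_t pud_table_bytes(uint64_t mem_bytes)
{
	return div_round_up(mem_bytes, ENTRIES_PER_TABLE * GiB) * PT_PAGE_SIZE;
}
```

Under these assumptions, `pmd_table_bytes(24 * TiB)` is 96 MiB (4 MiB per TiB) and `pud_table_bytes(24 * TiB)` is 192 KiB, which agrees with both numbers in the thread.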

Ok, will try gbpages for the identity mapping.

Regards,
Xunlei

> I can see forcing the low 1MiB range in. But calling it ISA range is
> very wrong and misleading. The reasons that range are special during
> boot-up have nothing to do with ISA. But have everything to do with
> where legacy page tables are mapped, and where we need identity pages to
> start other cpus. I think the only user that actually cares is
> purgatory where it plays swapping games with the low 1MiB because we
> can't preload what we need to down there or it would mess up the running
> kernel. So saying anything about the old ISA bus is wrong and
> misleading. At the very very least we need accurate comments.
>
> Eric
>
>
>> Signed-off-by: Xunlei Pang <xlp...@redhat.com>
>> ---
>>  arch/x86/kernel/machine_kexec_64.c | 34 ++++++++++++++++++++++++++++++----
>>  1 file changed, 30 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
>> index 857cdbd..db77a76 100644
>> --- a/arch/x86/kernel/machine_kexec_64.c
>> +++ b/arch/x86/kernel/machine_kexec_64.c
>> @@ -112,14 +112,40 @@ static int init_pgtable(struct kimage *image, unsigned long start_pgtable)
>>
>>  	level4p = (pgd_t *)__va(start_pgtable);
>>  	clear_page(level4p);
>> -	for (i = 0; i < nr_pfn_mapped; i++) {
>> -		mstart = pfn_mapped[i].start << PAGE_SHIFT;
>> -		mend = pfn_mapped[i].end << PAGE_SHIFT;
>>
>> +	if (image->type == KEXEC_TYPE_CRASH) {
>> +		/* Always map the ISA range */
>>  		result = kernel_ident_mapping_init(&info,
>> -				level4p, mstart, mend);
>> +				level4p, 0, ISA_END_ADDRESS);
>>  		if (result)
>>  			return result;
>> +
>> +		/* crashk_low_res may not be initialized when reaching here */
>> +		if (crashk_low_res.end) {
>> +			mstart = crashk_low_res.start;
>> +			mend = crashk_low_res.end + 1;
>> +			result = kernel_ident_mapping_init(&info,
>> +					level4p, mstart, mend);
>> +			if (result)
>> +				return result;
>> +		}
>> +
>> +		mstart = crashk_res.start;
>> +		mend = crashk_res.end + 1;
>> +		result = kernel_ident_mapping_init(&info,
>> +				level4p, mstart, mend);
>> +		if (result)
>> +			return result;
>> +	} else {
>> +		for (i = 0; i < nr_pfn_mapped; i++) {
>> +			mstart = pfn_mapped[i].start << PAGE_SHIFT;
>> +			mend = pfn_mapped[i].end << PAGE_SHIFT;
>> +
>> +			result = kernel_ident_mapping_init(&info,
>> +					level4p, mstart, mend);
>> +			if (result)
>> +				return result;
>> +		}
>>  	}
>>
>>  	/*