Our hardware (UV aka Superdome Flex) has address ranges marked reserved by the BIOS. These ranges can cause the system to halt if accessed.
During kernel initialization, the processor was speculating into reserved memory, causing system halts. The speculation is possible because the reserved memory is being mapped by the kernel.

The page table level2_kernel_pgt maps 1 GiB, and initially has all of its entries marked as valid. The kernel is placed anywhere in this range, depending on the virtual address selected by KASLR. Later in the boot process, the valid area is trimmed back to the space occupied by the kernel.

But during the interval when the full 1 GiB space is marked as valid, if the kernel physical address chosen by KASLR is close enough to our reserved memory regions, the valid pages outside the actual kernel space allow the processor to issue speculative accesses to the reserved space, causing the system to halt.

This was encountered somewhat rarely on a normal system boot, and somewhat more often when starting the crash kernel if "crashkernel=512M,high" was specified on the command line (because this heavily restricts the physical address of the crash kernel, usually to within 1 GiB of our reserved space).

The answer is to invalidate the entries of this table outside the address range occupied by the kernel before the page table is activated.

This patch has been validated to fix this problem on our hardware.
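For readers who want to try the idea outside the kernel, here is a minimal userspace sketch of the three-pass fixup. It is only an illustration, not the kernel code: the names sketch_fixup_kernel_pmd, SKETCH_PTRS_PER_PMD and SKETCH_PAGE_PRESENT are hypothetical, and the first_idx and last_idx parameters stand in for pmd_index((unsigned long)_text) and pmd_index((unsigned long)_end). It assumes a 512-entry table whose bit 0 is the present bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PTRS_PER_PMD 512            /* assumed: entries in one PMD table */
#define SKETCH_PAGE_PRESENT ((uint64_t)1)  /* assumed: bit 0 marks a valid entry */

/*
 * Hypothetical stand-in for the loop in __startup_64(): first_idx and
 * last_idx play the role of pmd_index(_text) and pmd_index(_end).
 */
static void sketch_fixup_kernel_pmd(uint64_t *pmd, size_t first_idx,
				    size_t last_idx, uint64_t load_delta)
{
	size_t i;

	assert(first_idx <= last_idx && last_idx < SKETCH_PTRS_PER_PMD);

	/* Entries below the kernel image: clear the present bit. */
	for (i = 0; i < first_idx; i++)
		pmd[i] &= ~SKETCH_PAGE_PRESENT;

	/* Entries covering the image: relocate the ones that are present. */
	for (; i <= last_idx; i++)
		if (pmd[i] & SKETCH_PAGE_PRESENT)
			pmd[i] += load_delta;

	/* Entries above the kernel image: clear the present bit. */
	for (; i < SKETCH_PTRS_PER_PMD; i++)
		pmd[i] &= ~SKETCH_PAGE_PRESENT;
}
```

Calling sketch_fixup_kernel_pmd(pmd, 8, 20, delta) on a table with every entry present leaves only entries 8 through 20 valid, each shifted by delta, so nothing outside the image remains mapped for the processor to speculate into.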

Signed-off-by: Steve Wahl <steve.w...@hpe.com>
Cc: sta...@vger.kernel.org
---
 arch/x86/kernel/head64.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 29ffa495bd1c..31f89a5defa3 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -225,10 +225,15 @@ unsigned long __head __startup_64(unsigned long physaddr,
 	 */
 
 	pmd = fixup_pointer(level2_kernel_pgt, physaddr);
-	for (i = 0; i < PTRS_PER_PMD; i++) {
+	for (i = 0; i < pmd_index((unsigned long)_text); i++)
+		pmd[i] &= ~_PAGE_PRESENT;
+
+	for (; i <= pmd_index((unsigned long)_end); i++)
 		if (pmd[i] & _PAGE_PRESENT)
 			pmd[i] += load_delta;
-	}
+
+	for (; i < PTRS_PER_PMD; i++)
+		pmd[i] &= ~_PAGE_PRESENT;
 
 	/*
 	 * Fixup phys_base - remove the memory encryption mask to obtain
-- 
2.21.0

-- 
Steve Wahl, Hewlett Packard Enterprise