On SGI UV systems, the kernel occasionally hangs with KASLR enabled. The backtrace is:

 kernel BUG at arch/x86/mm/init_64.c:311!
 invalid opcode: 0000 [#1] SMP
 [...]
 RIP: 0010:__init_extra_mapping+0x188/0x196
 [...]
 Call Trace:
  init_extra_mapping_uc+0x13/0x15
  map_high+0x67/0x75
  map_mmioh_high_uv3+0x20a/0x219
  uv_system_init_hub+0x12d9/0x1496
  uv_system_init+0x27/0x29
  native_smp_prepare_cpus+0x28d/0x2d8
  kernel_init_freeable+0xdd/0x253
  ? rest_init+0x80/0x80
  kernel_init+0xe/0x110
  ret_from_fork+0x2c/0x40

The root cause is that an SGI UV system needs to map its MMIOH regions
into the direct mapping section, and that mapping happens in rest_init().
However, mm KASLR is done in kernel_randomize_memory(), which runs much
earlier than the SGI UV MMIOH mapping and does not take the MMIOH regions
into account.

When KASLR is disabled, there is 64TB of space for the direct mapping of
system RAM, and system RAM and the SGI UV MMIOH regions share this 64TB
space. With KASLR enabled, mm KASLR only reserves the actual size of
system RAM plus 10TB for the direct mapping. The later SGI UV MMIOH
mapping can then go beyond the upper bound of the direct mapping section
and step into the VMALLOC or VMEMMAP area, which triggers the BUG_ON() in
__init_extra_mapping().

E.g. on the SGI UV3 machine where this bug was reported, there are two
MMIOH regions:

 [    1.519001] UV: Map MMIOH0_HI 0xffc00000000 - 0x100000000000
 [    1.523001] UV: Map MMIOH1_HI 0x100000000000 - 0x200000000000

They are [16TB-16G, 16TB) and [16TB, 32TB). On this machine, 512G of RAM
is spread out across 1TB regions. The two SGI MMIOH regions above are
also mapped into the direct mapping section.

To fix it, check whether this is an SGI UV system by calling
is_early_uv_system() in kernel_randomize_memory(). If it is, do not adapt
the size of the direct mapping section.

Signed-off-by: Baoquan He <b...@redhat.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: x...@kernel.org
Cc: Thomas Garnier <thgar...@google.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Andrew Morton <a...@linux-foundation.org>
Cc: Masahiro Yamada <yamada.masah...@socionext.com>
---
 arch/x86/mm/kaslr.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index aed2064..20b0456 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -27,6 +27,7 @@
 #include <asm/pgtable.h>
 #include <asm/setup.h>
 #include <asm/kaslr.h>
+#include <asm/uv/uv.h>
 
 #include "mm_internal.h"
 
@@ -123,7 +124,7 @@ void __init kernel_randomize_memory(void)
 		CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
 
 	/* Adapt phyiscal memory region size based on available memory */
-	if (memory_tb < kaslr_regions[0].size_tb)
+	if (memory_tb < kaslr_regions[0].size_tb && !is_early_uv_system())
 		kaslr_regions[0].size_tb = memory_tb;
 
 	/* Calculate entropy available between regions */
-- 
2.5.5