Aneesh Kumar K.V's on January 8, 2020 3:44 pm:
> Without this, the kernel can end up with SLB entries as below
>
>  04 c00c000008000000 00066bde000a7510 256M ESID=c00c00000 VSID= 66bde000a7 LLP:110
>  12 0000000008000000 00066bde000a7d90 256M ESID=        0 VSID= 66bde000a7 LLP:110
>
> Such SLB entries can result in machine check.
>
> We noticed this with 256MB segments because that resulted in the duplicate
> VSID with first vmemmap segment and first user segment. With 1TB segments
> we observe duplication with EAs like
>
> 0x100e64b vsid for EA 0xc00db50000000000 context 7
> 0x100e64b vsid for user EA 0x1b50000000000 context 7
>
> and those high addresses are not common and the kernel mapping in the
> above case is I/O remap range.
>
> [    0.000000] vmalloc start = 0xc008000000000000
> [    0.000000] IO start      = 0xc00a000000000000
> [    0.000000] vmemmap start = 0xc00c000000000000
>
> Fixes: 0034d395f89d ("powerpc/mm/hash64: Map all the kernel regions in the same 0xc range")
> Signed-off-by: Aneesh Kumar K.V <aneesh.ku...@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/mmu-hash.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> index 15b75005bc34..516db8a2e6ca 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> @@ -601,7 +601,7 @@ extern void slb_set_size(u16 size);
>   */
>  #define MAX_USER_CONTEXT	((ASM_CONST(1) << CONTEXT_BITS) - 2)
>  #define MIN_USER_CONTEXT	(MAX_KERNEL_CTX_CNT + MAX_VMALLOC_CTX_CNT + \
> -				 MAX_IO_CTX_CNT + MAX_VMEMMAP_CTX_CNT)
> +				 MAX_IO_CTX_CNT + MAX_VMEMMAP_CTX_CNT + 1)
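To make the off-by-one concrete: the change reserves one more context ID so
that user contexts start after the vmemmap context rather than on it. Below
is a minimal user-space sketch of that arithmetic; the per-region context
counts, the ESID width and the proto_vsid() helper are simplified stand-ins
chosen for illustration, not the kernel's real constants or its get_vsid()
implementation.

/*
 * Rough model only: kernel hash-MMU contexts are assumed to be handed out
 * as 1..LAST_KERNEL_CTX for the linear map, vmalloc, IO and vmemmap
 * regions, and user contexts are allocated starting at MIN_USER_CONTEXT.
 * Without the "+ 1", MIN_USER_CONTEXT equals the vmemmap context, so the
 * first user segment and the first vmemmap segment produce the same
 * proto-VSID.  All values below are illustrative, not the kernel's.
 */
#include <stdio.h>

#define ESID_BITS               18UL    /* hypothetical ESID width */

#define MAX_KERNEL_CTX_CNT      1       /* hypothetical per-region counts */
#define MAX_VMALLOC_CTX_CNT     1
#define MAX_IO_CTX_CNT          1
#define MAX_VMEMMAP_CTX_CNT     1

/* Kernel contexts occupy 1..LAST_KERNEL_CTX; vmemmap owns the last one. */
#define LAST_KERNEL_CTX         (MAX_KERNEL_CTX_CNT + MAX_VMALLOC_CTX_CNT + \
                                 MAX_IO_CTX_CNT + MAX_VMEMMAP_CTX_CNT)

#define MIN_USER_CONTEXT_BUGGY  (LAST_KERNEL_CTX)      /* lands on vmemmap */
#define MIN_USER_CONTEXT_FIXED  (LAST_KERNEL_CTX + 1)  /* first free ID */

/*
 * Simplified proto-VSID: the real code scrambles this value further, but
 * if the (context, esid) pairs collide, the scrambled VSIDs collide too.
 */
static unsigned long proto_vsid(unsigned long context, unsigned long esid)
{
        return (context << ESID_BITS) | esid;
}

int main(void)
{
        unsigned long esid = 0; /* first 256MB segment of each region */

        printf("vmemmap segment:         %#lx\n",
               proto_vsid(LAST_KERNEL_CTX, esid));
        printf("first user seg (buggy):  %#lx  <-- duplicate VSID\n",
               proto_vsid(MIN_USER_CONTEXT_BUGGY, esid));
        printf("first user seg (fixed):  %#lx\n",
               proto_vsid(MIN_USER_CONTEXT_FIXED, esid));
        return 0;
}

With the buggy lower bound, the very first context ID handed to a user mm
is the vmemmap context, which is the duplicate-VSID pair visible in the SLB
dump above.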
Good find and fix, but the changelog is a bit difficult to read. The bug is
an off-by-one error which means the first user context ID allocated is the
vmemmap ID, right? I would lead with that.

I'm not sure that machine checks are a primary symptom, since different
ESIDs mapping the same VSID is allowed. My guess is the machine check
happens a little later, after the vmemmap gets corrupted via its new
mapping.

Guessing this hasn't immediately resulted in wholesale mayhem because:

- Init is pretty small, doesn't use many segments or pages.

- The low 1TB is mapped with 256MB segments, which get a different VA hash
  than the 1TB vmemmap segment.

- init tends to load itself at 256MB, so even with disable_1tb_segments
  it's clashing with the second vmemmap segment, which covers roughly the
  second 256GB of memory on the first node, so it's not going to hit many
  systems (a rough back-of-the-envelope for that 256GB figure is sketched
  below).

Anyway, good find.

Reviewed-by: Nicholas Piggin <npig...@gmail.com>

Thanks,
Nick
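For the 256GB figure in the last bullet, here is a rough back-of-the-envelope
sketch. It assumes a 64-byte struct page and 64KB base pages, which are
typical for these machines but entirely configuration dependent; the numbers
are not taken from any specific kernel build.

#include <stdio.h>

int main(void)
{
        /* assumptions: 64-byte struct page, 64KB base pages */
        unsigned long long seg_size         = 256ULL << 20; /* one 256MB vmemmap segment */
        unsigned long long struct_page_size = 64;           /* assumed sizeof(struct page) */
        unsigned long long page_size        = 64ULL << 10;  /* assumed base page size */

        /* struct pages that fit in one vmemmap segment */
        unsigned long long pages = seg_size / struct_page_size;

        /* RAM described by those struct pages: 4M pages * 64KB = 256GB */
        unsigned long long ram = pages * page_size;

        printf("one 256MB vmemmap segment covers struct pages for %llu GB of RAM\n",
               ram >> 30);
        return 0;
}

So under those assumptions the Nth 256MB vmemmap segment describes roughly
the Nth 256GB of memory, which is why a clash with the second vmemmap
segment only matters on systems that actually have, and touch, that second
256GB on the first node.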