Michael Ellerman <m...@ellerman.id.au> writes:
> Joel reported weird crashes using skiroot_defconfig, in his case we
> jumped into an NX page:
>
>   kernel tried to execute exec-protected page (c000000002bff4f0) - exploit
>   attempt? (uid: 0)
>   BUG: Unable to handle kernel instruction fetch
>   Faulting instruction address: 0xc000000002bff4f0
>
> Looking at the disassembly, we had simply branched to that address:
>
>   c000000000c001bc 49fff335 bl c000000002bff4f0
>
> But that didn't match the original kernel image:
>
>   c000000000c001bc 4bfff335 bl c000000000bff4f0 <kobject_get+0x8>
>
> When STRICT_KERNEL_RWX is enabled, and we're using the radix MMU, we
> call radix__change_memory_range() late in boot to change page
> protections. We do that both to mark rodata read only and also to mark
> init text no-execute. That involves walking the kernel page tables,
> and clearing _PAGE_WRITE or _PAGE_EXEC respectively.
>
> With radix we may use hugepages for the linear mapping, so the code in
> radix__change_memory_range() uses eg. pmd_huge() to test if it has
> found a huge mapping, and if so it stops the page table walk and
> changes the PMD permissions.
>
> However if the kernel is built without HUGETLBFS support, pmd_huge()
> is just a #define that always returns 0. That causes the code in
> radix__change_memory_range() to incorrectly interpret the PMD value as
> a pointer to a PTE page rather than as a PTE at the PMD level.
>
> Unfortunately the combination of _PAGE_PTE and _PAGE_PRESENT in the
> high bits of the PMD entry give us 0xc in the top nibble which means
> the PMD entry happens to look like a valid pointer into the linear
> mapping.

Aneesh points out this paragraph is confusing. Those bits only make the
pointer *look* valid; they don't actually make any difference to the
code, as we mask and re-add those bits anyway.

I'll drop this paragraph when committing.

cheers
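
As an aside, the two branch targets in the quoted disassembly can be
checked by hand. Below is a minimal sketch in C, not kernel code: the
helper name branch_target() is made up for illustration, and it assumes
a relative I-form branch (primary opcode 18, AA = 0), which is the case
for both instruction words shown above.

```c
#include <stdint.h>

/*
 * Decode the target of a Power ISA I-form branch (b/bl, primary opcode 18).
 * The 24-bit LI field occupies the 0x03fffffc bits of the instruction word
 * and is a word offset, so masking yields a 26-bit signed byte displacement.
 * Only the relative form (AA = 0) is handled; that is an assumption of this
 * sketch, and branch_target() is a hypothetical helper, not a kernel API.
 */
static uint64_t branch_target(uint64_t addr, uint32_t insn)
{
	int64_t disp = insn & 0x03fffffc;

	if (disp & 0x02000000)		/* sign bit of the 26-bit displacement */
		disp -= 0x04000000;

	/* Unsigned addition wraps, so a negative displacement branches backwards. */
	return addr + (uint64_t)disp;
}
```

With the address 0xc000000000c001bc, the original word 0x4bfff335
decodes to 0xc000000000bff4f0 and the corrupted word 0x49fff335 decodes
to 0xc000000002bff4f0. The two words differ only in bit 0x02000000, the
sign bit of the displacement, so a single cleared bit turns a short
backward branch into a long forward one.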