On 05/09/2025 at 08:57, Andrew Donnellan wrote:
On Thu, 2025-09-04 at 18:33 +0200, Christophe Leroy wrote:
PAGE_KERNEL_TEXT is an old macro that is used to tell kernel whether
kernel text has to be mapped read-only or read-write based on build
time options.

But nowadays, with functionalities like jump_labels, static links,
etc ... more or less all kernels need to be read-write at some
point, and some combinations of configs failed to work due to
inaccurate setting of PAGE_KERNEL_TEXT. On the other hand, today
we have CONFIG_STRICT_KERNEL_RWX which implements a more controlled
access to kernel modifications.

Instead of trying to keep PAGE_KERNEL_TEXT accurate with all
possible options that may imply kernel text modification, always
set kernel text read-write at startup and rely on
CONFIG_STRICT_KERNEL_RWX to provide accurate protection.

Reported-by: Erhard Furtner <erhar...@mailbox.org>
Closes: https://lore.kernel.org/all/342b4120-911c-4723-82ec-d8c9b03a8aef@mailbox.org/
Signed-off-by: Christophe Leroy <christophe.le...@csgroup.eu>

The original issue that Erhard and I were investigating was why the latest
version of the PowerPC page table check series[0] was failing on his G4, when
built as part of a config with many other debugging options enabled.

With further instrumentation, it turns out that this was due to a failed
instruction patch while setting up a jump label for the
page_table_check_disabled static key, which was being checked in
page_table_check_pte_clear(), which was in turn inlined ultimately into
debug_vm_pgtable().

This patch seems to fix the problem, so:

Tested-by: Andrew Donnellan <a...@linux.ibm.com>

But I'm still curious about why I only see the issue when:

   (a) CONFIG_KFENCE=y (even when disabled using kfence.sample_interval=0) -
noting that changing CONFIG_KFENCE doesn't change the definition of
PAGE_KERNEL_TEXT; and

   (b) when the jump label ends up in a __init function (removing __init from
debug_vm_pgtable() and its associated functions, or changing the code in such a
way that the static key check doesn't get inlined, resolves the issue, and
similarly for test_static_call_init() when CONFIG_STATIC_CALL_SELFTEST=y).

I don't understand the mm code well enough to make sense of this.

That makes sense. When CONFIG_KFENCE is selected, only text and rodata are mapped with BATs. Everything else including inittext is mapped with pages. When CONFIG_KFENCE and CONFIG_DEBUG_PAGEALLOC are not selected, we map as much as possible with BATs.

And as you can see below, BATs are mapped with PAGE_KERNEL_X not with PAGE_KERNEL_TEXT.

Everything happens in the code below:

static unsigned long __init __mmu_mapin_ram(unsigned long base, unsigned long top)
{
        int idx;

        while ((idx = find_free_bat()) != -1 && base != top) {
                unsigned int size = bat_block_size(base, top);

                if (size < 128 << 10)
                        break;
                setbat(idx, PAGE_OFFSET + base, base, size, PAGE_KERNEL_X);
                base += size;
        }

        return base;
}

unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top)
{
        unsigned long done;
        unsigned long border = (unsigned long)__srwx_boundary - PAGE_OFFSET;
        unsigned long size;

        size = roundup_pow_of_two((unsigned long)_einittext - PAGE_OFFSET);
        setibat(0, PAGE_OFFSET, 0, size, PAGE_KERNEL_X);

        if (debug_pagealloc_enabled_or_kfence()) {
                pr_debug_once("Read-Write memory mapped without BATs\n");
                if (base >= border)
                        return base;
                if (top >= border)
                        top = border;
        }

        if (!strict_kernel_rwx_enabled() || base >= border || top <= border)
                return __mmu_mapin_ram(base, top);

        done = __mmu_mapin_ram(base, border);
        if (done != border)
                return done;

        return __mmu_mapin_ram(border, top);
}
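
To make the range logic above easier to follow, here is a rough user-space model of it. This is only a sketch and not kernel code: model_mapin_ram() and model_is_bat_mapped() are hypothetical names, the model assumes there are always enough free BATs and that every block meets the 128k minimum, and it leaves out the strict_kernel_rwx_enabled() split because that only changes where the BATs are cut at the border, not how far the mapping extends.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical, simplified model of mmu_mapin_ram(): returns the physical
 * address up to which RAM would be covered by BATs.
 * Assumptions: unlimited free BATs, no minimum block size, no strict RWX
 * split (that split does not move the end of the mapped range).
 */
static unsigned long model_mapin_ram(unsigned long base, unsigned long top,
                                     unsigned long border,
                                     bool pagealloc_or_kfence)
{
        assert(base <= top);

        if (pagealloc_or_kfence) {
                /* Nothing at or above the border is mapped with BATs. */
                if (base >= border)
                        return base;
                /* Stop at the border: only text and rodata get BATs. */
                if (top >= border)
                        top = border;
        }

        /* Idealised: everything in [base, top) gets a BAT. */
        return top;
}

/* True if addr falls inside the BAT-mapped range [base, done). */
static bool model_is_bat_mapped(unsigned long addr, unsigned long base,
                                unsigned long done)
{
        return addr >= base && addr < done;
}
```

With made-up values (border at 0x00800000, top at 0x10000000, an init text address at 0x00900000), the model stops at the border when the KFENCE flag is set, so the init text address is left to page mappings. With the flag clear, it runs up to top and the same address is covered by a BAT.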



[0] https://lore.kernel.org/all/20250813062614.51759-1-ajd@linux.ibm.com/


