Le 15/04/2020 à 07:11, Christopher M Riedl a écrit :
On March 24, 2020 11:25 AM Christophe Leroy <christophe.le...@c-s.fr> wrote:
Le 23/03/2020 à 05:52, Christopher M. Riedl a écrit :
Currently, code patching a STRICT_KERNEL_RWX exposes the temporary
mappings to other CPUs. These mappings should be kept local to the CPU
doing the patching. Use the pre-initialized temporary mm and patching
address for this purpose. Also add a check after patching to ensure the
patch succeeded.
Based on x86 implementation:
commit b3fd8e83ada0
("x86/alternatives: Use temporary mm for text poking")
Signed-off-by: Christopher M. Riedl <c...@informatik.wtf>
---
arch/powerpc/lib/code-patching.c | 128 ++++++++++++++-----------------
1 file changed, 57 insertions(+), 71 deletions(-)
diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 18b88ecfc5a8..f156132e8975 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -19,6 +19,7 @@
#include <asm/page.h>
#include <asm/code-patching.h>
#include <asm/setup.h>
+#include <asm/mmu_context.h>
static int __patch_instruction(unsigned int *exec_addr, unsigned int instr,
unsigned int *patch_addr)
@@ -65,99 +66,79 @@ void __init poking_init(void)
pte_unmap_unlock(ptep, ptl);
}
-static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
-
-static int text_area_cpu_up(unsigned int cpu)
-{
- struct vm_struct *area;
-
- area = get_vm_area(PAGE_SIZE, VM_ALLOC);
- if (!area) {
- WARN_ONCE(1, "Failed to create text area for cpu %d\n",
- cpu);
- return -1;
- }
- this_cpu_write(text_poke_area, area);
-
- return 0;
-}
-
-static int text_area_cpu_down(unsigned int cpu)
-{
- free_vm_area(this_cpu_read(text_poke_area));
- return 0;
-}
-
-/*
- * Run as a late init call. This allows all the boot time patching to be done
- * simply by patching the code, and then we're called here prior to
- * mark_rodata_ro(), which happens after all init calls are run. Although
- * BUG_ON() is rude, in this case it should only happen if ENOMEM, and we judge
- * it as being preferable to a kernel that will crash later when someone tries
- * to use patch_instruction().
- */
-static int __init setup_text_poke_area(void)
-{
- BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
- "powerpc/text_poke:online", text_area_cpu_up,
- text_area_cpu_down));
-
- return 0;
-}
-late_initcall(setup_text_poke_area);
+struct patch_mapping {
+ spinlock_t *ptl; /* for protecting pte table */
+ struct temp_mm temp_mm;
+};
/*
* This can be called for kernel text or a module.
*/
-static int map_patch_area(void *addr, unsigned long text_poke_addr)
+static int map_patch(const void *addr, struct patch_mapping *patch_mapping)
Why change the name ?
It's not really an "area" anymore.
{
- unsigned long pfn;
- int err;
+ struct page *page;
+ pte_t pte, *ptep;
+ pgprot_t pgprot;
if (is_vmalloc_addr(addr))
- pfn = vmalloc_to_pfn(addr);
+ page = vmalloc_to_page(addr);
else
- pfn = __pa_symbol(addr) >> PAGE_SHIFT;
+ page = virt_to_page(addr);
- err = map_kernel_page(text_poke_addr, (pfn << PAGE_SHIFT), PAGE_KERNEL);
+ if (radix_enabled())
+ pgprot = __pgprot(pgprot_val(PAGE_KERNEL));
+ else
+ pgprot = PAGE_SHARED;
Can you explain the difference between radix and non radix ?
Why PAGE_KERNEL for a page that is mapped in userspace ?
Why do you need to do __pgprot(pgprot_val(PAGE_KERNEL)) instead of just
using PAGE_KERNEL ?
On hash there is a manual check which prevents setting _PAGE_PRIVILEGED for
kernel to userspace access in __hash_page - hence we cannot access the mapping
if the page is mapped PAGE_KERNEL on hash. However, I would like to use
PAGE_KERNEL here as well and am working on understanding why this check is
done in hash and if this can change. On radix this works just fine.
The page is mapped PAGE_KERNEL because the address is technically a userspace
address - but only to keep the mapping local to this CPU doing the patching.
PAGE_KERNEL makes it clear both in intent and protection that this is a kernel
mapping.
I think the correct way is pgprot_val(PAGE_KERNEL) since PAGE_KERNEL is defined
as:
#define PAGE_KERNEL __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
and __pgprot() is defined as:
typedef struct { unsigned long pgprot; } pgprot_t;
#define pgprot_val(x) ((x).pgprot)
#define __pgprot(x) ((pgprot_t) { (x) })
Yes, so:
pgprot_val(__pgprot(x)) == x
You do:
pgprot = __pgprot(pgprot_val(PAGE_KERNEL));
Which is:
pgprot = __pgprot(pgprot_val(__pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)));
Which is equivalent to:
pgprot = __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW);
So at the end it should simply be:
pgprot = PAGE_KERNEL;
Christophe