> On April 15, 2020 3:45 AM Christophe Leroy <christophe.le...@c-s.fr> wrote: > > > Le 15/04/2020 à 07:11, Christopher M Riedl a écrit : > >> On March 24, 2020 11:25 AM Christophe Leroy <christophe.le...@c-s.fr> > >> wrote: > >> > >> > >> Le 23/03/2020 à 05:52, Christopher M. Riedl a écrit : > >>> Currently, code patching a STRICT_KERNEL_RWX exposes the temporary > >>> mappings to other CPUs. These mappings should be kept local to the CPU > >>> doing the patching. Use the pre-initialized temporary mm and patching > >>> address for this purpose. Also add a check after patching to ensure the > >>> patch succeeded. > >>> > >>> Based on x86 implementation: > >>> > >>> commit b3fd8e83ada0 > >>> ("x86/alternatives: Use temporary mm for text poking") > >>> > >>> Signed-off-by: Christopher M. Riedl <c...@informatik.wtf> > >>> --- > >>> arch/powerpc/lib/code-patching.c | 128 ++++++++++++++----------------- > >>> 1 file changed, 57 insertions(+), 71 deletions(-) > >>> > >>> diff --git a/arch/powerpc/lib/code-patching.c > >>> b/arch/powerpc/lib/code-patching.c > >>> index 18b88ecfc5a8..f156132e8975 100644 > >>> --- a/arch/powerpc/lib/code-patching.c > >>> +++ b/arch/powerpc/lib/code-patching.c > >>> @@ -19,6 +19,7 @@ > >>> #include <asm/page.h> > >>> #include <asm/code-patching.h> > >>> #include <asm/setup.h> > >>> +#include <asm/mmu_context.h> > >>> > >>> static int __patch_instruction(unsigned int *exec_addr, unsigned int > >>> instr, > >>> unsigned int *patch_addr) > >>> @@ -65,99 +66,79 @@ void __init poking_init(void) > >>> pte_unmap_unlock(ptep, ptl); > >>> } > >>> > >>> -static DEFINE_PER_CPU(struct vm_struct *, text_poke_area); > >>> - > >>> -static int text_area_cpu_up(unsigned int cpu) > >>> -{ > >>> - struct vm_struct *area; > >>> - > >>> - area = get_vm_area(PAGE_SIZE, VM_ALLOC); > >>> - if (!area) { > >>> - WARN_ONCE(1, "Failed to create text area for cpu %d\n", > >>> - cpu); > >>> - return -1; > >>> - } > >>> - this_cpu_write(text_poke_area, area); > >>> - > >>> - return 0; > >>> -} > >>> - > >>> -static int text_area_cpu_down(unsigned int cpu) > >>> -{ > >>> - free_vm_area(this_cpu_read(text_poke_area)); > >>> - return 0; > >>> -} > >>> - > >>> -/* > >>> - * Run as a late init call. This allows all the boot time patching to be > >>> done > >>> - * simply by patching the code, and then we're called here prior to > >>> - * mark_rodata_ro(), which happens after all init calls are run. Although > >>> - * BUG_ON() is rude, in this case it should only happen if ENOMEM, and > >>> we judge > >>> - * it as being preferable to a kernel that will crash later when someone > >>> tries > >>> - * to use patch_instruction(). > >>> - */ > >>> -static int __init setup_text_poke_area(void) > >>> -{ > >>> - BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, > >>> - "powerpc/text_poke:online", text_area_cpu_up, > >>> - text_area_cpu_down)); > >>> - > >>> - return 0; > >>> -} > >>> -late_initcall(setup_text_poke_area); > >>> +struct patch_mapping { > >>> + spinlock_t *ptl; /* for protecting pte table */ > >>> + struct temp_mm temp_mm; > >>> +}; > >>> > >>> /* > >>> * This can be called for kernel text or a module. > >>> */ > >>> -static int map_patch_area(void *addr, unsigned long text_poke_addr) > >>> +static int map_patch(const void *addr, struct patch_mapping > >>> *patch_mapping) > >> > >> Why change the name ? > >> > > > > It's not really an "area" anymore. > > > >>> { > >>> - unsigned long pfn; > >>> - int err; > >>> + struct page *page; > >>> + pte_t pte, *ptep; > >>> + pgprot_t pgprot; > >>> > >>> if (is_vmalloc_addr(addr)) > >>> - pfn = vmalloc_to_pfn(addr); > >>> + page = vmalloc_to_page(addr); > >>> else > >>> - pfn = __pa_symbol(addr) >> PAGE_SHIFT; > >>> + page = virt_to_page(addr); > >>> > >>> - err = map_kernel_page(text_poke_addr, (pfn << PAGE_SHIFT), PAGE_KERNEL); > >>> + if (radix_enabled()) > >>> + pgprot = __pgprot(pgprot_val(PAGE_KERNEL)); > >>> + else > >>> + pgprot = PAGE_SHARED; > >> > >> Can you explain the difference between radix and non radix ? > >> > >> Why PAGE_KERNEL for a page that is mapped in userspace ? > >> > >> Why do you need to do __pgprot(pgprot_val(PAGE_KERNEL)) instead of just > >> using PAGE_KERNEL ? > >> > > > > On hash there is a manual check which prevents setting _PAGE_PRIVILEGED for > > kernel to userspace access in __hash_page - hence we cannot access the > > mapping > > if the page is mapped PAGE_KERNEL on hash. However, I would like to use > > PAGE_KERNEL here as well and am working on understanding why this check is > > done in hash and if this can change. On radix this works just fine. > > > > The page is mapped PAGE_KERNEL because the address is technically a > > userspace > > address - but only to keep the mapping local to this CPU doing the patching. > > PAGE_KERNEL makes it clear both in intent and protection that this is a > > kernel > > mapping. > > > > I think the correct way is pgprot_val(PAGE_KERNEL) since PAGE_KERNEL is > > defined > > as: > > > > #define PAGE_KERNEL __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW) > > > > and __pgprot() is defined as: > > > > typedef struct { unsigned long pgprot; } pgprot_t; > > #define pgprot_val(x) ((x).pgprot) > > #define __pgprot(x) ((pgprot_t) { (x) }) > > > Yes, so: > pgprot_val(__pgprot(x)) == x > > > You do: > > pgprot = __pgprot(pgprot_val(PAGE_KERNEL)); > > Which is: > > pgprot = __pgprot(pgprot_val(__pgprot(_PAGE_BASE | _PAGE_KERNEL_RW))); > > Which is equivalent to: > > pgprot = __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW); > > So at the end it should simply be: > > pgprot = PAGE_KERNEL; >
Yes you're correct. Picking this up in the next spin. > > > > Christophe