> On April 15, 2020 3:45 AM Christophe Leroy <christophe.le...@c-s.fr> wrote:
> 
>  
> Le 15/04/2020 à 07:11, Christopher M Riedl a écrit :
> >> On March 24, 2020 11:25 AM Christophe Leroy <christophe.le...@c-s.fr> 
> >> wrote:
> >>
> >>   
> >> Le 23/03/2020 à 05:52, Christopher M. Riedl a écrit :
> >>> Currently, code patching a STRICT_KERNEL_RWX exposes the temporary
> >>> mappings to other CPUs. These mappings should be kept local to the CPU
> >>> doing the patching. Use the pre-initialized temporary mm and patching
> >>> address for this purpose. Also add a check after patching to ensure the
> >>> patch succeeded.
> >>>
> >>> Based on x86 implementation:
> >>>
> >>> commit b3fd8e83ada0
> >>> ("x86/alternatives: Use temporary mm for text poking")
> >>>
> >>> Signed-off-by: Christopher M. Riedl <c...@informatik.wtf>
> >>> ---
> >>>    arch/powerpc/lib/code-patching.c | 128 ++++++++++++++-----------------
> >>>    1 file changed, 57 insertions(+), 71 deletions(-)
> >>>
> >>> diff --git a/arch/powerpc/lib/code-patching.c 
> >>> b/arch/powerpc/lib/code-patching.c
> >>> index 18b88ecfc5a8..f156132e8975 100644
> >>> --- a/arch/powerpc/lib/code-patching.c
> >>> +++ b/arch/powerpc/lib/code-patching.c
> >>> @@ -19,6 +19,7 @@
> >>>    #include <asm/page.h>
> >>>    #include <asm/code-patching.h>
> >>>    #include <asm/setup.h>
> >>> +#include <asm/mmu_context.h>
> >>>    
> >>>    static int __patch_instruction(unsigned int *exec_addr, unsigned int 
> >>> instr,
> >>>                                  unsigned int *patch_addr)
> >>> @@ -65,99 +66,79 @@ void __init poking_init(void)
> >>>           pte_unmap_unlock(ptep, ptl);
> >>>    }
> >>>    
> >>> -static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
> >>> -
> >>> -static int text_area_cpu_up(unsigned int cpu)
> >>> -{
> >>> - struct vm_struct *area;
> >>> -
> >>> - area = get_vm_area(PAGE_SIZE, VM_ALLOC);
> >>> - if (!area) {
> >>> -         WARN_ONCE(1, "Failed to create text area for cpu %d\n",
> >>> -                 cpu);
> >>> -         return -1;
> >>> - }
> >>> - this_cpu_write(text_poke_area, area);
> >>> -
> >>> - return 0;
> >>> -}
> >>> -
> >>> -static int text_area_cpu_down(unsigned int cpu)
> >>> -{
> >>> - free_vm_area(this_cpu_read(text_poke_area));
> >>> - return 0;
> >>> -}
> >>> -
> >>> -/*
> >>> - * Run as a late init call. This allows all the boot time patching to be 
> >>> done
> >>> - * simply by patching the code, and then we're called here prior to
> >>> - * mark_rodata_ro(), which happens after all init calls are run. Although
> >>> - * BUG_ON() is rude, in this case it should only happen if ENOMEM, and 
> >>> we judge
> >>> - * it as being preferable to a kernel that will crash later when someone 
> >>> tries
> >>> - * to use patch_instruction().
> >>> - */
> >>> -static int __init setup_text_poke_area(void)
> >>> -{
> >>> - BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> >>> -         "powerpc/text_poke:online", text_area_cpu_up,
> >>> -         text_area_cpu_down));
> >>> -
> >>> - return 0;
> >>> -}
> >>> -late_initcall(setup_text_poke_area);
> >>> +struct patch_mapping {
> >>> + spinlock_t *ptl; /* for protecting pte table */
> >>> + struct temp_mm temp_mm;
> >>> +};
> >>>    
> >>>    /*
> >>>     * This can be called for kernel text or a module.
> >>>     */
> >>> -static int map_patch_area(void *addr, unsigned long text_poke_addr)
> >>> +static int map_patch(const void *addr, struct patch_mapping 
> >>> *patch_mapping)
> >>
> >> Why change the name ?
> >>
> > 
> > It's not really an "area" anymore.
> > 
> >>>    {
> >>> - unsigned long pfn;
> >>> - int err;
> >>> + struct page *page;
> >>> + pte_t pte, *ptep;
> >>> + pgprot_t pgprot;
> >>>    
> >>>           if (is_vmalloc_addr(addr))
> >>> -         pfn = vmalloc_to_pfn(addr);
> >>> +         page = vmalloc_to_page(addr);
> >>>           else
> >>> -         pfn = __pa_symbol(addr) >> PAGE_SHIFT;
> >>> +         page = virt_to_page(addr);
> >>>    
> >>> - err = map_kernel_page(text_poke_addr, (pfn << PAGE_SHIFT), PAGE_KERNEL);
> >>> + if (radix_enabled())
> >>> +         pgprot = __pgprot(pgprot_val(PAGE_KERNEL));
> >>> + else
> >>> +         pgprot = PAGE_SHARED;
> >>
> >> Can you explain the difference between radix and non radix ?
> >>
> >> Why PAGE_KERNEL for a page that is mapped in userspace ?
> >>
> >> Why do you need to do __pgprot(pgprot_val(PAGE_KERNEL)) instead of just
> >> using PAGE_KERNEL ?
> >>
> > 
> > On hash there is a manual check which prevents setting _PAGE_PRIVILEGED for
> > kernel to userspace access in __hash_page - hence we cannot access the 
> > mapping
> > if the page is mapped PAGE_KERNEL on hash. However, I would like to use
> > PAGE_KERNEL here as well and am working on understanding why this check is
> > done in hash and if this can change. On radix this works just fine.
> > 
> > The page is mapped PAGE_KERNEL because the address is technically a 
> > userspace
> > address - but only to keep the mapping local to this CPU doing the patching.
> > PAGE_KERNEL makes it clear both in intent and protection that this is a 
> > kernel
> > mapping.
> > 
> > I think the correct way is pgprot_val(PAGE_KERNEL) since PAGE_KERNEL is 
> > defined
> > as:
> > 
> > #define PAGE_KERNEL __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
> > 
> > and __pgprot() is defined as:
> > 
> > typedef struct { unsigned long pgprot; } pgprot_t;
> > #define pgprot_val(x)   ((x).pgprot)
> > #define __pgprot(x)     ((pgprot_t) { (x) })
> 
> 
> Yes, so:
>       pgprot_val(__pgprot(x)) == x
> 
> 
> You do:
> 
>       pgprot = __pgprot(pgprot_val(PAGE_KERNEL));
> 
> Which is:
> 
>       pgprot = __pgprot(pgprot_val(__pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)));
> 
> Which is equivalent to:
> 
>       pgprot = __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW);
> 
> So at the end it should simply be:
> 
>       pgprot = PAGE_KERNEL;
> 

Yes you're correct. Picking this up in the next spin.

> 
> 
> 
> Christophe

Reply via email to