> On April 8, 2020 6:01 AM Christophe Leroy <christophe.le...@c-s.fr> wrote:
>
>
> Le 31/03/2020 à 05:19, Christopher M Riedl a écrit :
> >> On March 24, 2020 11:10 AM Christophe Leroy <christophe.le...@c-s.fr>
> >> wrote:
> >>
> >>
> >> Le 23/03/2020 à 05:52, Christopher M. Riedl a écrit :
> >>> When code patching a STRICT_KERNEL_RWX kernel the page containing the
> >>> address to be patched is temporarily mapped with permissive memory
> >>> protections. Currently, a per-cpu vmalloc patch area is used for this
> >>> purpose. While the patch area is per-cpu, the temporary page mapping is
> >>> inserted into the kernel page tables for the duration of the patching.
> >>> The mapping is exposed to CPUs other than the patching CPU - this is
> >>> undesirable from a hardening perspective.
> >>>
> >>> Use the `poking_init` init hook to prepare a temporary mm and patching
> >>> address. Initialize the temporary mm by copying the init mm. Choose a
> >>> randomized patching address inside the temporary mm userspace address
> >>> portion. The next patch uses the temporary mm and patching address for
> >>> code patching.
> >>>
> >>> Based on x86 implementation:
> >>>
> >>> commit 4fc19708b165
> >>> ("x86/alternatives: Initialize temporary mm for patching")
> >>>
> >>> Signed-off-by: Christopher M. Riedl <c...@informatik.wtf>
> >>> ---
> >>>  arch/powerpc/lib/code-patching.c | 26 ++++++++++++++++++++++++++
> >>>  1 file changed, 26 insertions(+)
> >>>
> >>> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> >>> index 3345f039a876..18b88ecfc5a8 100644
> >>> --- a/arch/powerpc/lib/code-patching.c
> >>> +++ b/arch/powerpc/lib/code-patching.c
> >>> @@ -11,6 +11,8 @@
> >>>  #include <linux/cpuhotplug.h>
> >>>  #include <linux/slab.h>
> >>>  #include <linux/uaccess.h>
> >>> +#include <linux/sched/task.h>
> >>> +#include <linux/random.h>
> >>>
> >>>  #include <asm/pgtable.h>
> >>>  #include <asm/tlbflush.h>
> >>> @@ -39,6 +41,30 @@ int raw_patch_instruction(unsigned int *addr, unsigned int instr)
> >>>  }
> >>>
> >>>  #ifdef CONFIG_STRICT_KERNEL_RWX
> >>> +
> >>> +__ro_after_init struct mm_struct *patching_mm;
> >>> +__ro_after_init unsigned long patching_addr;
> >>
> >> Can we make those static ?
> >>
> >
> > Yes, makes sense to me.
> >
> >>> +
> >>> +void __init poking_init(void)
> >>> +{
> >>> +	spinlock_t *ptl; /* for protecting pte table */
> >>> +	pte_t *ptep;
> >>> +
> >>> +	patching_mm = copy_init_mm();
> >>> +	BUG_ON(!patching_mm);
> >>
> >> Does it need to be a BUG_ON() ? Can't we fail gracefully with just a
> >> WARN_ON() ?
> >>
> >
> > I'm not sure what failing gracefully means here? The main reason this could
> > fail is if there is not enough memory to allocate the patching_mm. The
> > previous implementation had this justification for BUG_ON():
>
> But the system can continue running just fine after this failure.
> Only the things that make use of code patching will fail (ftrace, kgdb, ...).
>
> Checkpatch says: "Avoid crashing the kernel - try using WARN_ON &
> recovery code rather than BUG() or BUG_ON()"
>
> All vital code patching has already been done previously, so I think a
> WARN_ON() should be enough, plus returning non-zero to indicate that the
> late_initcall failed.

Got it, makes sense to me. I will make these changes in the next version. Thanks!
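Maybe something along these lines for the copy_init_mm() failure - a completely
untested sketch, and it assumes poking_init() stays a void hook called from
init/main.c, so "failing gracefully" just means leaving patching_mm NULL for
the code in the next patch to check (if this turns into an initcall as you
suggest, it would return -ENOMEM instead):

	patching_mm = copy_init_mm();
	if (WARN_ON(!patching_mm))
		return;	/* patching_mm stays NULL; users must check it */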
> >
> > /*
> >  * Run as a late init call. This allows all the boot time patching to be done
> >  * simply by patching the code, and then we're called here prior to
> >  * mark_rodata_ro(), which happens after all init calls are run. Although
> >  * BUG_ON() is rude, in this case it should only happen if ENOMEM, and we judge
> >  * it as being preferable to a kernel that will crash later when someone tries
> >  * to use patch_instruction().
> >  */
> > static int __init setup_text_poke_area(void)
> > {
> > 	BUG_ON(!cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> > 		"powerpc/text_poke:online", text_area_cpu_up,
> > 		text_area_cpu_down));
> >
> > 	return 0;
> > }
> > late_initcall(setup_text_poke_area);
> >
> > I think the BUG_ON() is appropriate even if only to adhere to the previous
> > judgement call. I can add a similar comment explaining the reasoning if
> > that helps.
> >
> >>> +
> >>> +	/*
> >>> +	 * In hash we cannot go above DEFAULT_MAP_WINDOW easily.
> >>> +	 * XXX: Do we want additional bits of entropy for radix?
> >>> +	 */
> >>> +	patching_addr = (get_random_long() & PAGE_MASK) %
> >>> +		(DEFAULT_MAP_WINDOW - PAGE_SIZE);
> >>> +
> >>> +	ptep = get_locked_pte(patching_mm, patching_addr, &ptl);
> >>> +	BUG_ON(!ptep);
> >>
> >> Same here, can we fail gracefully instead ?
> >>
> >
> > Same reasoning as above.
>
> Here as well, a WARN_ON() should be enough; the system will continue
> running after that.
>
> >
> >>> +	pte_unmap_unlock(ptep, ptl);
> >>> +}
> >>> +
> >>>  static DEFINE_PER_CPU(struct vm_struct *, text_poke_area);
> >>>
> >>>  static int text_area_cpu_up(unsigned int cpu)
> >>>
> >>
> >> Christophe
>
> Christophe
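For the second BUG_ON() I will apply the same treatment in the next version.
Roughly (again untested, and the mmput() unwind is my assumption about the
cleanest way to back out of the half-initialized state):

	ptep = get_locked_pte(patching_mm, patching_addr, &ptl);
	if (WARN_ON(!ptep)) {
		/* Drop the temporary mm so nothing tries to use it later */
		mmput(patching_mm);
		patching_mm = NULL;
		return;
	}
	pte_unmap_unlock(ptep, ptl);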