> On Aug 28, 2019, at 3:31 PM, Thomas Gleixner <t...@linutronix.de> wrote:
>
> ftrace does not use text_poke() for enabling trace functionality. It uses
> its own mechanism and flips the whole kernel text to RW and back to RO.
>
> The CPA rework removed a loop-based check of 4k pages which tried to
> preserve a large page by checking, for each 4k page, whether the change
> would actually cover all pages in the large page.
>
> This amounted to looping for nothing, as testing showed that the check
> never actually preserved anything. Of course, the testing failed to
> include ftrace, which is the one and only case that benefitted from the
> 4k loop.
>
> As a consequence, enabling function tracing or ftrace-based kprobes
> results in a full 4k split of the kernel text, which affects iTLB
> performance.
>
> The kernel text RO protection is the only valid case where this can
> actually preserve large pages.
>
> All other static protections (RO data, data NX, PCI, BIOS) are truly
> static. So a conflict with those protections which results in a split
> should only ever happen when a change of memory next to a protected
> region is attempted. Such conflicts rightfully split the large page to
> preserve the protected regions. In fact, a change to the protected
> regions themselves is a bug and is warned about.
>
> Add an exception to the static protection check for kernel text RO when
> the to-be-changed region spans a full large page, which allows the large
> mappings to be preserved. This also prevents the syslog from being
> spammed with CPA violations when ftrace is used.
>
> The exception needs to be removed once ftrace has switched over to
> text_poke(), which avoids the whole issue.
>
> Fixes: 585948f4f695 ("x86/mm/cpa: Avoid the 4k pages check completely")
> Reported-by: Song Liu <songliubrav...@fb.com>
> Signed-off-by: Thomas Gleixner <t...@linutronix.de>
> Cc: sta...@vger.kernel.org
This looks great. Much cleaner than my workaround. Thanks!

Reviewed-and-tested-by: Song Liu <songliubrav...@fb.com>

We need this for v4.20 to v5.3 (assuming Peter's patches will land in 5.4).

> ---
>  arch/x86/mm/pageattr.c |   26 ++++++++++++++++++--------
>  1 file changed, 18 insertions(+), 8 deletions(-)
>
> --- a/arch/x86/mm/pageattr.c
> +++ b/arch/x86/mm/pageattr.c
> @@ -516,7 +516,7 @@ static inline void check_conflict(int wa
>   */
>  static inline pgprot_t static_protections(pgprot_t prot, unsigned long start,
>  					  unsigned long pfn, unsigned long npg,
> -					  int warnlvl)
> +					  unsigned long lpsize, int warnlvl)
>  {
>  	pgprotval_t forbidden, res;
>  	unsigned long end;
> @@ -535,9 +535,17 @@ static inline pgprot_t static_protection
>  	check_conflict(warnlvl, prot, res, start, end, pfn, "Text NX");
>  	forbidden = res;
>
> -	res = protect_kernel_text_ro(start, end);
> -	check_conflict(warnlvl, prot, res, start, end, pfn, "Text RO");
> -	forbidden |= res;
> +	/*
> +	 * Special case to preserve a large page. If the change spans the
> +	 * full large page mapping then there is no point to split it
> +	 * up. Happens with ftrace and is going to be removed once ftrace
> +	 * switched to text_poke().
> +	 */
> +	if (lpsize != (npg * PAGE_SIZE) || (start & (lpsize - 1))) {
> +		res = protect_kernel_text_ro(start, end);
> +		check_conflict(warnlvl, prot, res, start, end, pfn, "Text RO");
> +		forbidden |= res;
> +	}
>
>  	/* Check the PFN directly */
>  	res = protect_pci_bios(pfn, pfn + npg - 1);
> @@ -819,7 +827,7 @@ static int __should_split_large_page(pte
>  	 * extra conditional required here.
>  	 */
>  	chk_prot = static_protections(old_prot, lpaddr, old_pfn, numpages,
> -				      CPA_CONFLICT);
> +				      psize, CPA_CONFLICT);
>
>  	if (WARN_ON_ONCE(pgprot_val(chk_prot) != pgprot_val(old_prot))) {
>  		/*
> @@ -855,7 +863,7 @@ static int __should_split_large_page(pte
>  	 * protection requirement in the large page.
>  	 */
>  	new_prot = static_protections(req_prot, lpaddr, old_pfn, numpages,
> -				      CPA_DETECT);
> +				      psize, CPA_DETECT);
>
>  	/*
>  	 * If there is a conflict, split the large page.
> @@ -906,7 +914,8 @@ static void split_set_pte(struct cpa_dat
>  	if (!cpa->force_static_prot)
>  		goto set;
>
> -	prot = static_protections(ref_prot, address, pfn, npg, CPA_PROTECT);
> +	/* Hand in lpsize = 0 to enforce the protection mechanism */
> +	prot = static_protections(ref_prot, address, pfn, npg, 0, CPA_PROTECT);
>
>  	if (pgprot_val(prot) == pgprot_val(ref_prot))
>  		goto set;
> @@ -1503,7 +1512,8 @@ static int __change_page_attr(struct cpa
>  		pgprot_val(new_prot) |= pgprot_val(cpa->mask_set);
>
>  		cpa_inc_4k_install();
> -		new_prot = static_protections(new_prot, address, pfn, 1,
> +		/* Hand in lpsize = 0 to enforce the protection mechanism */
> +		new_prot = static_protections(new_prot, address, pfn, 1, 0,
>  					      CPA_PROTECT);
>
>  		new_prot = pgprot_clear_protnone_bits(new_prot);
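
To double check my reading of the new guard, here is the condition pulled
out into a tiny userspace harness. The helper name, the main() driver and
the example addresses are mine and purely illustrative; only the condition
itself is from the patch, where it guards protect_kernel_text_ro():

#include <stdio.h>

#define PAGE_SIZE	4096UL
#define PMD_SIZE	(512 * PAGE_SIZE)	/* 2M large page on x86-64 */

/*
 * Mirrors the new guard in static_protections(): the Text RO check is
 * skipped (and the large page preserved) only when the change covers
 * exactly one whole, aligned large page.
 */
static int text_ro_check_skipped(unsigned long start, unsigned long npg,
				 unsigned long lpsize)
{
	return !(lpsize != (npg * PAGE_SIZE) || (start & (lpsize - 1)));
}

int main(void)
{
	/* ftrace flipping a full, aligned 2M text mapping: preserved */
	printf("%d\n", text_ro_check_skipped(0x200000, 512, PMD_SIZE));
	/* a single 4k page inside that mapping: RO check still enforced */
	printf("%d\n", text_ro_check_skipped(0x201000, 1, PMD_SIZE));
	/* lpsize = 0 as handed in by the split/4k paths: always enforced */
	printf("%d\n", text_ro_check_skipped(0x200000, 512, 0));
	return 0;
}

This prints 1, 0, 0. With lpsize = 0 the size comparison can never match,
so the callers that pass 0 keep the protection mechanism fully enforced,
while ftrace's whole-page RW/RO flips no longer force a split.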