On Mon, Jun 08, 2026 at 12:23:07PM +0100, Lorenzo Stoakes wrote:
> On Mon, Jun 08, 2026 at 04:36:38AM -0400, Michael S. Tsirkin wrote:
> > When post_alloc_hook() needs to zero a page for an explicit
> > __GFP_ZERO allocation for a user page (user_addr is set), use
> > folio_zero_user()
> > instead of kernel_init_pages(). This zeros near the faulting
> > address last, keeping those cachelines hot for the impending
> > user access.
> >
> > folio_zero_user() is only used for explicit __GFP_ZERO, not for
> > init_on_alloc. On architectures with virtually-indexed caches
> > (e.g., ARM), clear_user_highpage() performs per-line cache
> > operations; using it for init_on_alloc would add overhead that
> > kernel_init_pages() avoids (the page fault path flushes the
> > cache at PTE installation time regardless).
> >
> > No functional change yet: current callers do not pass __GFP_ZERO
> > for user pages (they zero at the callsite instead). Subsequent
> > patches will convert them.
> >
> > Signed-off-by: Michael S. Tsirkin <[email protected]>
> > Assisted-by: Claude:claude-opus-4-6
> > ---
> > mm/page_alloc.c | 35 ++++++++++++++++++++++++++++++++---
> > 1 file changed, 32 insertions(+), 3 deletions(-)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 4676fd49819e..d4fbf1861a8a 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -1861,9 +1861,38 @@ inline void post_alloc_hook(struct page *page,
> > unsigned int order,
> > for (i = 0; i != 1 << order; ++i)
> > page_kasan_tag_reset(page + i);
> > }
> > - /* If memory is still not initialized, initialize it now. */
> > - if (init)
> > - kernel_init_pages(page, 1 << order);
> > + /*
> > + * On architectures with cache aliasing, pages zeroed via the
> > + * kernel direct map (e.g. init_on_free) must be re-zeroed
> > + * through a user-congruent mapping. Host-zeroed pages
> > + * (zeroed flag) don't need this: physical RAM is clean.
> > + */
> > + if (!init && (gfp_flags & __GFP_ZERO) &&
> > + user_addr != USER_ADDR_NONE &&
> > + user_alloc_needs_zeroing())
>
> We check this (gfp_flags & __GFP_ZERO) && user_addr != USER_ADDR_NONE thing
> twice, can we just put in a 'init_should_folio_zero' const bool or something?
>
> > + init = true;
>
> As Vlasta says not sure if we want to add complexity just for these arches.
>
> > + /*
> > + * If memory is still not initialized, initialize it now.
>
> I kinda hate that 'init' is unclear as to 'do init' or 'was init somewhere
> else'... Anwyay.
>
> > + * When __GFP_ZERO was explicitly requested and user_addr is set,
> > + * use folio_zero_user() which zeros near the faulting address
> > + * last, keeping those cachelines hot. For init_on_alloc, use
> > + * kernel_init_pages() to avoid unnecessary cache flush overhead
> > + * on architectures with virtually-indexed caches.
>
> This whole paragraph seems pretty useless and just describing the code?
>
> > + */
> > + if (init) {
> > + if ((gfp_flags & __GFP_ZERO) && user_addr != USER_ADDR_NONE) {
> > + /*
> > + * folio_zero_user relies on folio_nr_pages which
> > + * requires __GFP_COMP for order > 0. All user folio
> > + * allocations set __GFP_COMP via __folio_alloc.
>
> This whole paragraph is useless and very like the kind of stuff AI generates
> for
> comments, i.e. overly long + entirely unnecessary stuff.
It was an attempt to make sashiko shut up, it doesn't understand the
context and kept complaining. Didn't really help so yea I should drop this.
>
> > + * user_addr != USER_ADDR_NONE implies sleepable
> > + * context (user page fault).
>
> Can you safely assume that? Also inferring which context we are in from this
> parameter seems risky.
>
> It seems to me that you're now making it such that kernel developers:
>
> - Have to know when and when not to specify a user address, and under what
> circumstances we might consider that to be mapped.
>
> - Need to know to do this correctly for aliasing architectures or have silent
> correctness issues.
>
> - Need to take context into account when specifying this.
>
> We definitely need to find a simpler way to do this!
>
> > + */
> > + VM_WARN_ON_ONCE(order && !(gfp_flags & __GFP_COMP));
>
> Surely by now we can assume this?
Another attempt to make it obvious.
> > + folio_zero_user(page_folio(page), user_addr);
> > + } else
> > + kernel_init_pages(page, 1 << order);
>
> I hate this hanging else branch... definitely prefer {} on both branches.
>
> But in any case it seems like we could avoid some indentation with something
> like:
>
> if (init && init_should_folio_zero) {
> ...
> } else if (init) {
> ...
> }
>
> Or even a:
>
> if (!init)
> goto out;
>
> And stick an out label below?
>
> > + }
>
> >
> > set_page_owner(page, order, gfp_flags);
> > page_table_check_alloc(page, order);
> > --
> > MST
> >
>
> Oh and in general it seems that this conflicts with [0] which removes
> kernel_init_pages().
>
> [0];https://lore.kernel.org/all/[email protected]/
>
> Thanks, Lorenzo