On Tue, May 26, 2026 at 02:04:52PM +0100, Kiryl Shutsemau wrote:
> From: "Kiryl Shutsemau (Meta)" <[email protected]>
>
> vma_flags_t is one unsigned long on 32-bit -- NUM_VMA_FLAG_BITS ==
> BITS_PER_LONG by design, so VM_xxx-declared bits sit in the first
> word and hit the single-long fast path. But the bit enum declares
> some bits unconditionally above BITS_PER_LONG (VMA_UFFD_MINOR_BIT
> == 41 today, with VM_UFFD_MINOR == VM_NONE on 32-bit so no VMA
> actually carries the bit).

Yeah ugh.

>
> Passing such a bit to mk_vma_flags() goes through __set_bit(41,
> &one_long) and writes one word past the end. The compiler folds
> the OOB store with wraparound (1UL << (41 % 32) == bit 9) into
> the first word. Bit 9 is already in __VMA_UFFD_FLAGS so the mask
> happens to come out right today, but any high-numbered bit whose

That is... helpful :) but not great that this is the situation, an
oversight, clearly! How I hate 32-bit kernels :)

> mod-BITS_PER_LONG position is otherwise unused would silently OR
> an extra bit into the mask.
>
> Add VMA_NO_BIT and have DECLARE_VMA_BIT() resolve any bitnum out
> of range to it. vma_flags_set_flag() drops negative bit values.
> The ternary collapses at compile time, the runtime check folds
> away when the bit is in range, and the common path is unchanged.

Hmm are you sure it does?

A key design goal was that mk_vma_flags() generates compile-time constants
the same as if the bitmap were constructed independently.

This surely must generate code? Or at least runs a significant risk of it?

Setting a precedent here would probably lead to VMA_NO_BIT being used more
and therefore generating code in more places I think.

And I'm not sure I really want to bend over backwards here to work around
issues with 32-bit kernels when in the long run the intent is that we move
to making these values 64-bit anyway.

Perhaps it's even wise possibly to just make these values 64-bit already,
ahead of time?

(I did look at this in terms of wanting something like a VMA_NO_BIT so we
could get VM_NONE-like behaviour but nothing I tried failed to generate
code.)

A simple solution that doesn't require change to the core is to just uglify
userfaultfd_k.h a bit with:

#ifdef HAVE_ARCH_USERFAULTFD_MINOR
#define __VMA_UFFD_FLAGS mk_vma_flags(VMA_UFFD_MISSING_BIT, VMA_UFFD_WP_BIT, \
                                      VMA_UFFD_MINOR_BIT)
#else
#define __VMA_UFFD_FLAGS mk_vma_flags(VMA_UFFD_MISSING_BIT, VMA_UFFD_WP_BIT)
#endif

But of course that becomes much more horrible with your changes...

Another alternative, which I used for VMA_DROPPABLE is to add something
like this in mm.h:

#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_MINOR
#define VM_UFFD_MINOR   INIT_VM_FLAG(UFFD_MINOR)
+define VMA_UFFD_MINOR  mk_vma_flags(VMA_UFFD_MINOR_BIT)
#else
#define VM_UFFD_MINOR   VM_NONE
+define VMA_UFFD_MINOR  EMPTY_VMA_FLAGS
#endif

Then we can do:

#define __VMA_UFFD_FLAGS append_vma_flags(VMA_MINOR, VMA_UFFD_MISSING_BIT, \
                                          VMA_UFFD_WP_BIT)

With you changes in 08/18 on top it'd get hairier, but we could make our
life easier by implementing something like:

static __always_inline vma_flags_t __mk_vma_flags_from_masks(size_t count,
        const vma_flags_t *masks)
{
        vma_flags_t flags = EMPTY_VMA_FLAGS;
        int i;

        for (i = 0; i < count; i++)
                mask = vma_flags_set_mask(&flags, masks[i]);

        return flags;
}

#define mk_vma_flags_from_masks(...) __mk_vma_flags_from_masks(, \
                COUNT_ARGS(__VA_ARGS__), (const vma_flags_t []){__VA_ARGS__})

(untested code - you would need to ensure it generated equivalent
constants as VM_xxx would now :)

Then you could get VMA_UFFD_RWP with:

#ifdef CONFIG_USERFAULTFD_RWP
#define VMA_UFFD_RWP    mk_vma_flags(VMA_UFFD_RWP_BIT)
#else
#define VMA_UFFD_RWP    EMPTY_VMA_FLAGS
#endif

And then:


/* Always available if CONFIG_USERFAULTFD set. */
#define __VMA_UFFD_DEFAULT_FLAGS \
                mk_vma_flags(VMA_UFFD_MISSING_BIT, VMA_UFFD_WP_BIT)

#define __VMA_UFFD_FLAGS mk_vma_flags_from_masks(__VMA_UFFD_DEFAULT_FLAGS \
                VMA_MINOR, VMA_RWP)

Which is kind ok-ish right? I mean it's all a bit fugly obviously.

>
> Bits declared in the enum are now safe to pass to mk_vma_flags()
> regardless of arch.

I mean another issue here is we're then codifying behaviour that's legacy
ahead of time. I really want to avoid that.

>
> Fixes: 9ea35a25d51b ("mm: introduce VMA flags bitmap type")
> Cc: [email protected]
> Signed-off-by: Kiryl Shutsemau <[email protected]>
> ---
>  include/linux/mm.h | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0f2612a70fb1..71b11945e4fc 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -286,8 +286,17 @@ extern unsigned int kobjsize(const void *objp);
>   */
>  typedef int __bitwise vma_flag_t;
>
> -#define DECLARE_VMA_BIT(name, bitnum) \
> -     VMA_ ## name ## _BIT = ((__force vma_flag_t)bitnum)
> +/*
> + * VMA_NO_BIT means "no bit"; mk_vma_flags() skips it. DECLARE_VMA_BIT()
> + * below uses it for any bit number that doesn't fit in the bitmap, so
> + * callers don't need to track which bits are valid on the current build.
> + */
> +#define VMA_NO_BIT   ((__force vma_flag_t)-1)
> +
> +#define DECLARE_VMA_BIT(name, bitnum)                                        
> \
> +     VMA_ ## name ## _BIT = (((bitnum) < NUM_VMA_FLAG_BITS) ?        \
> +                             ((__force vma_flag_t)(bitnum)) :        \
> +                             VMA_NO_BIT)
>  #define DECLARE_VMA_BIT_ALIAS(name, aliased) \
>       VMA_ ## name ## _BIT = (VMA_ ## aliased ## _BIT)
>  enum {
> @@ -1081,6 +1090,8 @@ static __always_inline void 
> vma_flags_set_flag(vma_flags_t *flags,
>  {
>       unsigned long *bitmap = flags->__vma_flags;
>
> +     if ((__force int)bit < 0)
> +             return;
>       __set_bit((__force int)bit, bitmap);
>  }
>
> --
> 2.54.0
>

Either way, I think we should break out any fix like this from the series.

Andrew is I think also not a fan of fixes patches in the middle of series
anyway :P

Cheers, Lorenzo

Reply via email to