Hello Ram,

Ram Pai <linux...@us.ibm.com> writes:

> Applications need the ability to associate an address-range with some
> key and latter revert to its initial default key. Pkey-0 comes close to
> providing this function but falls short, because the current
> implementation disallows applications to explicitly associate pkey-0 to
> the address range.
>
> Lets make pkey-0 less special and treat it almost like any other key.
> Thus it can be explicitly associated with any address range, and can be
> freed. This gives the application more flexibility and power.  The
> ability to free pkey-0 must be used responsibily, since pkey-0 is
> associated with almost all address-range by default.
>
> Even with this change pkey-0 continues to be slightly more special
> from the following point of view.
> (a) it is implicitly allocated.
> (b) it is the default key assigned to any address-range.

It's also special in more ways (and if intentional, these should be part
of the commit message as well):

(c) it's not possible to change permissions for key 0

  This has two causes: this patch explicitly forbids it in
  arch_set_user_pkey_access(), and also because even if it's allocated,
  the bits for key 0 in AMOR and UAMOR aren't set.

(d) it can be freed, but can't be allocated again later.

  This is because mm_pkey_alloc() only calls __arch_activate_pkey(ret)
  if ret > 0.

It looks like (d) is a bug. Either mm_pkey_free() should fail with key
0, or mm_pkey_alloc() should work with it.

(c) could be a measure to prevent users from shooting themselves in
their feet. But if that is the case, then mm_pkey_free() should forbid
freeing key 0 too.

> Tested on powerpc.
>
> cc: Thomas Gleixner <t...@linutronix.de>
> cc: Dave Hansen <dave.han...@intel.com>
> cc: Michael Ellermen <m...@ellerman.id.au>
> cc: Ingo Molnar <mi...@kernel.org>
> cc: Andrew Morton <a...@linux-foundation.org>
> Signed-off-by: Ram Pai <linux...@us.ibm.com>
> ---
> History:
>       v2: mm_pkey_is_allocated() continued to treat pkey-0 special.
>           fixed it.
>
>  arch/powerpc/include/asm/pkeys.h | 20 ++++++++++++++++----
>  arch/powerpc/mm/pkeys.c          | 20 ++++++++++++--------
>  2 files changed, 28 insertions(+), 12 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/pkeys.h 
> b/arch/powerpc/include/asm/pkeys.h
> index 0409c80..b598fa9 100644
> --- a/arch/powerpc/include/asm/pkeys.h
> +++ b/arch/powerpc/include/asm/pkeys.h
> @@ -101,10 +101,14 @@ static inline u16 pte_to_pkey_bits(u64 pteflags)
>
>  static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
>  {
> -     /* A reserved key is never considered as 'explicitly allocated' */
> -     return ((pkey < arch_max_pkey()) &&
> -             !__mm_pkey_is_reserved(pkey) &&
> -             __mm_pkey_is_allocated(mm, pkey));
> +     if (pkey < 0 || pkey >= arch_max_pkey())
> +             return false;
> +
> +     /* Reserved keys are never allocated. */
> +     if (__mm_pkey_is_reserved(pkey))
> +             return false;
> +
> +     return __mm_pkey_is_allocated(mm, pkey);
>  }
>
>  extern void __arch_activate_pkey(int pkey);
> @@ -200,6 +204,14 @@ static inline int arch_set_user_pkey_access(struct 
> task_struct *tsk, int pkey,
>  {
>       if (static_branch_likely(&pkey_disabled))
>               return -EINVAL;
> +
> +     /*
> +      * userspace is discouraged from changing permissions of
> +      * pkey-0.

They're not discouraged. They're not allowed to. :-)

> +      * powerpc hardware does not support it anyway.

It doesn't? I don't get that impression from reading the ISA, but
perhaps I'm missing something.

> +      */
> +     if (!pkey)
> +             return init_val ? -EINVAL : 0;
> +
>       return __arch_set_user_pkey_access(tsk, pkey, init_val);
>  }
>
> diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> index ba71c54..e7a9e34 100644
> --- a/arch/powerpc/mm/pkeys.c
> +++ b/arch/powerpc/mm/pkeys.c
> @@ -119,19 +119,21 @@ int pkey_initialize(void)
>  #else
>       os_reserved = 0;
>  #endif
> -     /*
> -      * Bits are in LE format. NOTE: 1, 0 are reserved.
> -      * key 0 is the default key, which allows read/write/execute.
> -      * key 1 is recommended not to be used. PowerISA(3.0) page 1015,
> -      * programming note.
> -      */
> +     /* Bits are in LE format. */
>       initial_allocation_mask = ~0x0;
>
>       /* register mask is in BE format */
>       pkey_amr_uamor_mask = ~0x0ul;
>       pkey_iamr_mask = ~0x0ul;
>
> -     for (i = 2; i < (pkeys_total - os_reserved); i++) {
> +     for (i = 0; i < (pkeys_total - os_reserved); i++) {
> +             /*

There's a space between the tabs here.

> +              * key 1 is recommended not to be used.
> +              * PowerISA(3.0) page 1015,
> +              */
> +             if (i == 1)
> +                     continue;
> +
>               initial_allocation_mask &= ~(0x1 << i);
>               pkey_amr_uamor_mask &= ~(0x3ul << pkeyshift(i));
>               pkey_iamr_mask &= ~(0x1ul << pkeyshift(i));
> @@ -145,7 +147,9 @@ void pkey_mm_init(struct mm_struct *mm)
>  {
>       if (static_branch_likely(&pkey_disabled))
>               return;
> -     mm_pkey_allocation_map(mm) = initial_allocation_mask;
> +
> +     /* allocate key-0 by default */
> +     mm_pkey_allocation_map(mm) = initial_allocation_mask | 0x1;
>       /* -1 means unallocated or invalid */
>       mm->context.execute_only_pkey = -1;
>  }

I think we should also set the AMOR and UAMOR bits for key 0. Otherwise,
key 0 will be in allocated-but-not-enabled state which is yet another
subtle way in which it will be special.

Also, pkey_access_permitted() has a special case for key 0. Should it?

--
Thiago Jung Bauermann
IBM Linux Technology Center

Reply via email to