Applications need the ability to associate an address-range with some
key and later revert to its initial default key. Pkey-0 comes close to
providing this function but falls short, because the current
implementation does not allow applications to explicitly associate
pkey-0 with an address-range.
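
For illustration only (not part of this patch), here is a minimal userspace sketch of that use case: tag a mapping with a freshly allocated key, then hand it back to pkey-0. The sys_pkey_*() wrappers, the enum and pkey0_roundtrip() are hypothetical names made up for this example; only the pkey_alloc, pkey_mprotect and pkey_free system calls themselves are real. Whether the last step succeeds depends on the kernel accepting an explicit pkey-0.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin local wrappers around the raw system calls (hypothetical helper names). */
static int sys_pkey_alloc(unsigned long flags, unsigned long init_val)
{
	return (int)syscall(SYS_pkey_alloc, flags, init_val);
}

static int sys_pkey_free(int pkey)
{
	return (int)syscall(SYS_pkey_free, pkey);
}

static int sys_pkey_mprotect(void *addr, size_t len, unsigned long prot, int pkey)
{
	return (int)syscall(SYS_pkey_mprotect, addr, len, prot, pkey);
}

/* Outcome codes for the sketch; illustrative only, not a kernel interface. */
enum roundtrip_result {
	ROUNDTRIP_OK = 0,		/* range tagged, then returned to pkey-0 */
	ROUNDTRIP_NO_PKEYS = 1,		/* pkey_alloc failed: no support or no free key */
	ROUNDTRIP_PKEY0_REFUSED = 2,	/* kernel rejected an explicit pkey-0 */
	ROUNDTRIP_ERROR = 3		/* mmap or the initial tagging failed */
};

enum roundtrip_result pkey0_roundtrip(size_t len)
{
	enum roundtrip_result res = ROUNDTRIP_ERROR;
	void *addr;
	int pkey;

	assert(len > 0);	/* caller must pass a non-empty range */

	addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED)
		return ROUNDTRIP_ERROR;

	/* Ask for any free key, with no access restrictions. */
	pkey = sys_pkey_alloc(0, 0);
	if (pkey < 0) {
		munmap(addr, len);
		return ROUNDTRIP_NO_PKEYS;
	}

	/* Associate the range with the new key ... */
	if (sys_pkey_mprotect(addr, len, PROT_READ | PROT_WRITE, pkey) == 0) {
		/* ... and later revert it to the default key, pkey-0. */
		if (sys_pkey_mprotect(addr, len, PROT_READ | PROT_WRITE, 0) == 0)
			res = ROUNDTRIP_OK;
		else
			res = ROUNDTRIP_PKEY0_REFUSED;
	}

	/* Unmap first so that no mapping still carries the key when it is freed. */
	munmap(addr, len);
	sys_pkey_free(pkey);
	return res;
}
```

Calling pkey0_roundtrip(4096) on a machine without protection-key support should report ROUNDTRIP_NO_PKEYS; on a kernel that refuses an explicit pkey-0 it reports ROUNDTRIP_PKEY0_REFUSED.
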

Clarify the semantics of pkey-0 and provide the corresponding
implementation.

Pkey-0 is special with the following semantics.
(a) it is implicitly allocated and can never be freed. It always exists.
(b) it is the default key assigned to any address-range.
(c) it can be explicitly associated with any address-range.

Tested on powerpc only. Could not test on x86.

cc: Thomas Gleixner <t...@linutronix.de>
cc: Dave Hansen <dave.han...@intel.com>
cc: Michael Ellerman <m...@ellerman.id.au>
cc: Ingo Molnar <mi...@kernel.org>
cc: Andrew Morton <a...@linux-foundation.org>
Signed-off-by: Ram Pai <linux...@us.ibm.com>
---
History:

    v4 : (1) moved the code entirely in arch-independent location.
         (2) fixed comments -- suggested by Thomas Gleixner

    v3 : added clarification of the semantics of pkey0.
         -- suggested by Dave Hansen

    v2 : split the patch into two, one for x86 and one for powerpc
         -- suggested by Michael Ellerman

 Documentation/x86/protection-keys.txt |    8 ++++++++
 mm/mprotect.c                         |   25 ++++++++++++++++++++++---
 2 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/Documentation/x86/protection-keys.txt b/Documentation/x86/protection-keys.txt
index ecb0d2d..92802c4 100644
--- a/Documentation/x86/protection-keys.txt
+++ b/Documentation/x86/protection-keys.txt
@@ -88,3 +88,11 @@ with a read():
 The kernel will send a SIGSEGV in both cases, but si_code will be set
 to SEGV_PKERR when violating protection keys versus SEGV_ACCERR when
 the plain mprotect() permissions are violated.
+
+====================== pkey 0 ==================================
+
+Pkey-0 is special. It is implicitly allocated. Applications cannot allocate or
+free that key. This key is the default key that gets associated with an
+address-space. It can be explicitly associated with any address-space.
+
+================================================================
diff --git a/mm/mprotect.c b/mm/mprotect.c
index e3309fc..2c779fa 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -430,7 +430,13 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
 	 * them use it here.
 	 */
 	error = -EINVAL;
-	if ((pkey != -1) && !mm_pkey_is_allocated(current->mm, pkey))
+
+	/*
+	 * pkey-0 is special. It always exists. No need to check if it is
+	 * allocated. Check allocation status of all other keys. pkey=-1
+	 * is not really a key; it means: use any available key.
+	 */
+	if (pkey && pkey != -1 && !mm_pkey_is_allocated(current->mm, pkey))
 		goto out;
 
 	vma = find_vma(current->mm, start);
@@ -549,6 +555,12 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
 	if (pkey == -1)
 		goto out;
 
+	if (!pkey) {
+		mm_pkey_free(current->mm, pkey);
+		printk("Internal error, cannot explicitly allocate key-0");
+		goto out;
+	}
+
 	ret = arch_set_user_pkey_access(current, pkey, init_val);
 	if (ret) {
 		mm_pkey_free(current->mm, pkey);
@@ -564,13 +576,20 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
 {
 	int ret;
 
+	/*
+	 * pkey-0 is special. Userspace can never allocate or free it. It is
+	 * allocated by default. It always exists.
+	 */
+	if (!pkey)
+		return -EINVAL;
+
 	down_write(&current->mm->mmap_sem);
 	ret = mm_pkey_free(current->mm, pkey);
 	up_write(&current->mm->mmap_sem);
 
 	/*
-	 * We could provie warnings or errors if any VMA still
-	 * has the pkey set here.
+	 * We could provide warnings or errors if any VMA still has the pkey
+	 * set here.
 	 */
 	return ret;
 }
-- 
1.7.1