llvmbot wrote:
@llvm/pr-subscribers-clang

Author: Oliver Hunt (ojhunt)

<details>
<summary>Changes</summary>

This updates the pointer authentication documentation to include a complete description of the existing functionality and behaviour, details of the more complex aspects of the semantics and security properties, and the Apple arm64e ABI design.

Co-authored-by: Ahmed Bougacha
Co-authored-by: Akira Hatanaka
Co-authored-by: John Mccall

---

Patch is 57.39 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/152596.diff

1 Files Affected:

- (modified) clang/docs/PointerAuthentication.rst (+1105-20)

``````````diff
diff --git a/clang/docs/PointerAuthentication.rst b/clang/docs/PointerAuthentication.rst
index 913291c954447..18e82152b583c 100644
--- a/clang/docs/PointerAuthentication.rst
+++ b/clang/docs/PointerAuthentication.rst
@@ -47,14 +47,14 @@ This document serves four purposes:
 - It documents several language extensions that are useful on targets using
   pointer authentication.
 
-- It will eventually present a theory of operation for the security mitigation,
-  describing the basic requirements for correctness, various weaknesses in the
-  mechanism, and ways in which programmers can strengthen its protections
-  (including recommendations for language implementors).
+- It presents a theory of operation for the security mitigation, describing the
+  basic requirements for correctness, various weaknesses in the mechanism, and
+  ways in which programmers can strengthen its protections (including
+  recommendations for language implementors).
+
+- It documents the language ABIs currently used for C, C++, and Objective-C
+  on arm64e.
 
-- It will eventually document the language ABIs currently used for C, C++,
-  Objective-C, and Swift on arm64e, although these are not yet stable on any
-  target.
 
 Basic Concepts
 --------------
@@ -125,7 +125,7 @@ independently for I and D keys.)
   interfaces or as primitives in a compiler IR because they expose raw
   pointers. Raw pointers require special attention in the language
   implementation to avoid the accidental creation of exploitable code
-  sequences.
+  sequences; see the section on `Attackable code sequences`_.
 
 The following details are all implementation-defined:
 
@@ -163,14 +163,20 @@ a cryptographic signature, other implementations may be possible. See
   data is simply a pepper added to the hash, not an encryption key, and so
   can be initialized using random data.
 
+  *** FIXME *** fpac?
 - ``sign`` computes a cryptographic hash of the pointer, discriminator, and
   signing key, and stores it in the high bits as the signature. ``auth``
   removes the signature, computes the same hash, and compares the result with
   the stored signature. ``strip`` removes the signature without
-  authenticating it. While ``aut*`` instructions do not themselves trap on
-  failure in Armv8.3 PAuth, they do with the later optional FPAC extension.
-  An implementation can also choose to emulate this trapping behavior by
-  emitting additional instructions around ``aut*``.
+  authenticating it. The ``aut`` instructions in the baseline Armv8.3 PAuth
+  feature do not guarantee to trap on authentication failure; instead, they
+  simply corrupt the pointer so that later uses will likely trap. Unless the
+  "later use" follows immediately and cannot be recovered from (e.g. with a
+  signal handler), this does not provide adequate protection against
+  `authentication oracles`_, so implementations must emit additional
+  instructions to force an immediate trap. This is unnecessary if the
+  processor provides the optional ``FPAC`` extension, which guarantees an
+  immediate trap.
 
 - ``sign_generic`` corresponds to the ``pacga`` instruction, which takes two
   64-bit values and produces a 64-bit cryptographic hash. Implementations of
@@ -255,17 +261,133 @@ signing schema breaks down even more simply:
 It is important that the signing schema be independently derived at all signing
 and authentication sites. Preferably, the schema should be hard-coded
 everywhere it is needed, but at the very least, it must not be derived by
-inspecting information stored along with the pointer.
+inspecting information stored along with the pointer. See the section on
+`Attacks on pointer authentication`_ for more information.
+
 
 Language Features
 -----------------
 
-There is currently one main pointer authentication language feature:
+There are three levels of the pointer authentication language feature:
+
+- The language implementation automatically signs and authenticates function
+  pointers (and certain data pointers) across a variety of standard situations,
+  including return addresses, function pointers, and C++ virtual functions. The
+  intent is for all pointers to code in program memory to be signed in some way
+  and for all branches to code in program text to authenticate those
+  signatures.
+
+- The language also provides extensions to override the default rules used by
+  the language implementation. For example, the ``__ptrauth`` type qualifier
+  can be used to change how pointers are signed when they are stored in
+  a particular variable or field; this provides much stronger protection than
+  is guaranteed by the default rules for C function and data pointers.
 
-- The language provides the ``<ptrauth.h>`` intrinsic interface for manually
-  signing and authenticating pointers in code. These can be used in
+- Finally, the language provides the ``<ptrauth.h>`` intrinsic interface for
+  manually signing and authenticating pointers in code. These can be used in
   circumstances where very specific behavior is required.
 
+Language Implementation
+~~~~~~~~~~~~~~~~~~~~~~~
+
+For the most part, pointer authentication is an unobserved detail of the
+implementation of the programming language. Any element of the language
+implementation that would perform an indirect branch to a pointer is implicitly
+altered so that the pointer is signed when first constructed and authenticated
+when the branch is performed. This includes:
+
+- indirect-call features in the programming language, such as C function
+  pointers, C++ virtual functions, C++ member function pointers, the "blocks"
+  C extension, and so on;
+
+- returning from a function, no matter how it is called; and
+
+- indirect calls introduced by the implementation, such as branches through the
+  global offset table (GOT) used to implement direct calls to functions defined
+  outside of the current shared object.
+
+For more information about this, see the `Language ABI`_ section.
+
+However, some aspects of the implementation are observable by the programmer or
+otherwise require special notice.
+
+C data pointers
+^^^^^^^^^^^^^^^
+
+The current implementation in Clang does not sign pointers to ordinary data by
+default. For a partial explanation of the reasoning behind this, see the
+`Theory of Operation`_ section.
+
+A specific data pointer which is more security-sensitive than most can be
+signed using the `__ptrauth qualifier`_ or using the ``<ptrauth.h>``
+intrinsics.
+
+C function pointers
+^^^^^^^^^^^^^^^^^^^
+
+The C standard imposes restrictions on the representation and semantics of
+function pointer types which make it difficult to achieve satisfactory
+signature diversity in the default language rules. See `Attacks on pointer
+authentication`_ for more information about signature diversity. Programmers
+should strongly consider using the ``__ptrauth`` qualifier to improve the
+protections for important function pointers, such as the components of of
+a hand-rolled "v-table"; see the section on the `__ptrauth qualifier`_ for
+details.
+
+The value of a pointer to a C function includes a signature, even when the
+value is cast to a non-function-pointer type like ``void*`` or ``intptr_t``. On
+implementations that use high bits to store the signature, this means that
+relational comparisons and hashes will vary according to the exact signature
+value, which is likely to change between executions of a program. In some
+implementations, it may also vary based on the exact function pointer type.
+
+Null pointers
+^^^^^^^^^^^^^
+
+In principle, an implementation could derive the signed null pointer value
+simply by applying the standard signing algorithm to the raw null pointer
+value. However, for likely signing algorithms, this would mean that the signed
+null pointer value would no longer be statically known, which would have many
+negative consequences. For one, it would become substantially more expensive
+to emit null pointer values or to perform null-pointer checks. For another,
+the pervasive (even if technically unportable) assumption that null pointers
+are bitwise zero would be invalidated, making it substantially more difficult
+to adopt pointer authentication, as well as weakening common optimizations for
+zero-initialized memory such as the use of ``.bzz`` sections. Therefore it is
+beneficial to treat null pointers specially by giving them their usual
+representation. On AArch64, this requires additional code when working with
+possibly-null pointers, such as when copying a pointer field that has been
+signed with address diversity.
+
+While this representation of nulls is the safest option for the general case,
+there are some situations in which a null pointer may have important semantic
+or security impact. For that purpose clang has the concept of a pointer
+authentication schema that signs and authenticates null values.
+
+Return addresses and frame pointers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The current implementation in Clang implicitly signs both return addresses and
+frame pointers. While these values are technically implementation details of
+a function, there are some important libraries and development tools which rely
+on manually walking the chain of stack frames. These tools must be updated to
+correctly account for pointer authentication, either by stripping signatures
+(if security is not important for the tool, e.g. if it is capturing a stack
+trace during a crash) or properly authenticating them. More information about
+how these values are signed is available in the `Language ABI`_ section.
+
+C++ virtual functions
+^^^^^^^^^^^^^^^^^^^^^
+
+The current implementation in Clang signs virtual function pointers with
+a discriminator derived from the full signature of the overridden method,
+including the method name and parameter types. It is possible to write C++
+code that relies on v-table layout remaining constant despite changes to
+a method signature; for example, a parameter might be a ``typedef`` that
+resolves to a different type based on a build setting. Such code violates
+C++'s One Definition Rule (ODR), but that violation is not normally detected;
+however, pointer authentication will detect it.
+
 Language Extensions
 ~~~~~~~~~~~~~~~~~~~
 
@@ -276,10 +398,21 @@ Feature Testing
 Whether the current target uses pointer authentication can be tested for with
 a number of different tests.
 
-- ``__has_feature(ptrauth_intrinsics)`` is true if ``<ptrauth.h>`` provides its
-  normal interface. This may be true even on targets where pointer
+- ``__has_extension(ptrauth_intrinsics)`` is true if ``<ptrauth.h>`` provides
+  its normal interface. This may be true even on targets where pointer
   authentication is not enabled by default.
 
+- ``__has_extension(ptrauth_returns)`` is true if the target uses pointer
+  authentication to protect return addresses.
+
+- ``__has_extension(ptrauth_calls)`` is true if the target uses pointer
+  authentication to protect indirect branches. This implies
+  ``__has_extension(ptrauth_returns)`` and
+  ``__has_extension(ptrauth_intrinsics)``.
+
+Clang provides several other tests only for historical purposes; for current
+purposes they are all equivalent to ``ptrauth_calls``.
+
 __ptrauth Qualifier
 ^^^^^^^^^^^^^^^^^^^
 
@@ -293,6 +426,11 @@ type, either to a function or to an object, or a pointer sized integer. It
 currently cannot be an Objective-C pointer type, a C++ reference type, or a
 block pointer type; these restrictions may be lifted in the future.
 
+The current implementation in Clang is known to not provide adequate safety
+guarantees against the creation of `signing oracles`_ when assigning data
+pointers to ``__ptrauth``-qualified gl-values. See the section on `safe
+derivation`_ for more information.
+
 The qualifier's operands are as follows:
 
 - ``key`` - an expression evaluating to a key value from ``<ptrauth.h>``; must
@@ -327,6 +465,54 @@ a discriminator determined as follows:
   is ``ptrauth_blend_discriminator(&x, discriminator)``; see
   `ptrauth_blend_discriminator`_.
 
+Non-triviality from address diversity
++++++++++++++++++++++++++++++++++++++
+
+Address diversity must impose additional restrictions in order to allow the
+implementation to correctly copy values. In C++, a type qualified with address
+diversity is treated like a class type with non-trivial copy/move constructors
+and assignment operators, with the usual effect on containing classes and
+unions. C does not have a standard concept of non-triviality, and so we must
+describe the basic rules here, with the intention of imitating the emergent
+rules of C++:
+
+- A type may be **non-trivial to copy**.
+
+- A type may also be **illegal to copy**. Types that are illegal to copy are
+  always non-trivial to copy.
+
+- A type may also be **address-sensitive**.
+
+- A type qualified with a ``ptrauth`` qualifier that requires address diversity
+  is non-trivial to copy and address-sensitive.
+
+- An array type is illegal to copy, non-trivial to copy, or address-sensitive
+  if its element type is illegal to copy, non-trivial to copy, or
+  address-sensitive, respectively.
+
+- A struct type is illegal to copy, non-trivial to copy, or address-sensitive
+  if it has a field whose type is illegal to copy, non-trivial to copy, or
+  address-sensitive, respectively.
+
+- A union type is both illegal and non-trivial to copy if it has a field whose
+  type is non-trivial or illegal to copy.
+
+- A union type is address-sensitive if it has a field whose type is
+  address-sensitive.
+
+- A program is ill-formed if it uses a type that is illegal to copy as
+  a function parameter, argument, or return type.
+
+- A program is ill-formed if an expression requires a type to be copied that is
+  illegal to copy.
+
+- Otherwise, copying a type that is non-trivial to copy correctly copies its
+  subobjects.
+
+- Types that are address-sensitive must always be passed and returned
+  indirectly. Thus, changing the address-sensitivity of a type may be
+  ABI-breaking even if its size and alignment do not change.
+
 ``<ptrauth.h>``
 ~~~~~~~~~~~~~~~
 
@@ -433,7 +619,7 @@ Produce a signed pointer for the given raw pointer without applying any
 authentication or extra treatment. This operation is not required to have the
 same behavior on a null pointer that the language implementation would.
 
-This is a treacherous operation that can easily result in signing oracles.
+This is a treacherous operation that can easily result in `signing oracles`_.
 Programs should use it seldom and carefully.
 
 ``ptrauth_auth_and_resign``
@@ -454,7 +640,29 @@ a null pointer that the language implementation would.
 The code sequence produced for this operation must not be directly attackable.
 However, if the discriminator values are not constant integers, their
 computations may still be attackable. In the future, Clang should be enhanced
-to guaranteed non-attackability if these expressions are safely-derived.
+to guaranteed non-attackability if these expressions are
+:ref:`safely-derived<Safe derivation>`.
+
+``ptrauth_auth_function``
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c
+
+  ptrauth_auth_function(pointer, key, discriminator)
+
+Authenticate that ``pointer`` is signed with ``key`` and ``discriminator`` and
+re-sign it to the standard schema for a function pointer of its type.
+
+``pointer`` must have function pointer type. The result will have the same
+type as ``pointer``. This operation is not required to have the same behavior
+on a null pointer that the language implementation would.
+
+This operation makes the same attackability guarantees as
+``ptrauth_auth_and_resign``.
+
+If this operation appears syntactically as the function operand of a call,
+Clang guarantees that the call will directly authenticate the function value
+using the given schema rather than re-signing to the standard schema.
 
 ``ptrauth_auth_data``
 ^^^^^^^^^^^^^^^^^^^^^
@@ -500,7 +708,884 @@ type. Implementations are not required to make all bits of the result equally
 significant; in particular, some implementations are known to not leave
 meaningful data in the low bits.
 
+Standard ``__ptrauth`` qualifiers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``<ptrauth.h>`` additionally provides several macros which expand to
+``__ptrauth`` qualifiers for common ABI situations.
+
+For convenience, these macros expand to nothing when pointer authentication is
+disabled.
+
+These macros can be found in the header; some details of these macros may be
+unstable or implementation-specific.
+
+
+Theory of Operation
+-------------------
+
+The threat model of pointer authentication is as follows:
+
+- The attacker has the ability to read and write to a certain range of
+  addresses, possibly the entire address space. However, they are constrained
+  by the normal rules of the process: for example, they cannot write to memory
+  that is mapped read-only, and if they access unmapped memory it will trigger
+  a trap.
+
+- The attacker has no ability to add arbitrary executable code to the program.
+  For example, the program does not include malicious code to begin with, and
+  the attacker cannot alter existing instructions, load a malicious shared
+  library, or remap writable pages as executable. If the attacker wants to get
+  the process to perform a specific sequence of actions, they must somehow
+  subvert the normal control flow of the process.
+
+In both of the above paragraphs, it is merely assumed that the attacker's
+*current* capabilities are restricted; that is, their current exploit does not
+directly give them the power to do these things. The attacker's immediate goal
+may well be to leverage their exploit to gain these capabilities, e.g. to load
+a malicious dynamic library into the process, even though the process does not
+directly contain code to do so.
+
+Note that any bug that fits the above threat model can be immediately exploited
+as a denial-of-service attack by simply performing an illegal access and
+crashing the program. Pointer authentication cannot protect against this.
+While denial-of-service attacks are unfortunate, they are also unquestionably
+the best possible result of a bug this severe. Therefore, pointer authentication
+enthusiastically embraces the idea of halting the program on a pointer
+authentication failure rather than continuing in a possibly-compromised state.
+
+Pointer authentication is a form of control-flow integrity (CFI) enforcement.
+The basic security hypothesis behind CFI enforcement is that many bugs can only
+be usefully exploited (other than as a denial-of-service) by leveraging them to
+subvert the control flow of the program. If this is true, then by inhibiting or
+limiting that subversion, it may be possible to largely mitigate the security
+consequences of those bugs by rendering them impractical (or, ideally,
+impossible) to exploit.
+
+Every indirect branch in a program has a purpose. Using human intelligence, a
+programmer can describe where a particular branch *should* go according to this
+purpose: a ``return`` in ``printf`` should return to the call site, a particular
+call in ``qsort`` should call the comparator that was passed in as an argument,
+and so on. But for CFI to enforce that every branch in a program goes where it
+*should* in this sense would require CFI to perfectly enforce every semantic
+rule of the program's abstract machine; that is, it would require making the
+programming environment perfectly sound. That is out of scope. Instead, the
+goal of CFI is merely to catch attempts to make a branch go somewhere that its
+obviously *shouldn't* for its purpose: for example, to stop a call from
+branching into the middle of a function rather than its beginning. As the
+information available to CFI gets better about the purpose of the branch, CFI
+can enforce tighter and tighter restrictions on where the branch is permitted to
+go. Still, ultimately CFI cannot make the program sound. This may help explain
+why pointer authentication makes some of the choices it does: for example, to
+sign and authenticate mostly code pointers rather than every pointer in the
+program. Preventing attackers from redirecting branches is both particularly
+important and particularly approachable as a goal. Detecting corruption more
+broadly is infeasible with t... [truncated]
``````````

</details>

https://github.com/llvm/llvm-project/pull/152596