[PATCH] D142584: [CodeGen] Add a boolean flag to `Address::getPointer` and `Lvalue::getPointer` that indicates whether the pointer is known not to be null

John McCall via Phabricator via cfe-commits Wed, 25 Jan 2023 19:58:25 -0800

rjmccall added a comment.

In D142584#4081345 <https://reviews.llvm.org/D142584#4081345>, @efriedma wrote:


> I'm having a little trouble following what the meaning of an "Address" is 
> with this patch.  The `Pointer` member refers to an encoded value, and 
> calling getPointer() emits IR to decode it?  Could use a few comments to

explain the API.

That's something of a future direction w.r.t this patch, but it's reasonable to 
talk about it, yeah.

Pointer authentication includes the ability to have signed data pointers.  
Function pointers can only be used opaquely in C, but data pointers are subject 
to a lot of operations that boil down to pointer arithmetic, both as part of 
l-value accesses (such as `signedPtr->array[i] = 5`) and just in abstract 
pointer-typed expressions (such as `&ptr->array[i]`).  For l-values that are 
ultimately accessed, we really want to make sure that we do the authentication 
as close as possible to the actual access rather than authenticating it 
immediately and then having a raw, attackable pointer sitting around while we 
set up the access.  For pointer expressions in general, if they ultimately need 
to produce another signed pointer (e.g. we assign into a `__ptrauth`-qualified 
l-value), for similar reasons we really want to set up all the arithmetic ahead 
of time for a secure auth-add-and-resign sequence instead of authenticating and 
having raw, attackable pointers sitting around as we do the arithmetic.  We 
could do all that with a specialized pointer-with-signature emitter that covers 
literally every kind of pointer arithmetic, but that requires duplication of a 
*lot* of code, especially because we have a lot of sanitizers and other special 
logic also tied to those operations.  It seemed much less invasive to add a 
state to `Address` where it can be a pair of a signed pointer and an offset, 
then make a few basic GEP-formation operations accumulate into the offset when 
the pointer is in that state.  Of course, converting an `Address` into a normal 
pointer then becomes a non-trivial operation that needs a CGF.

Pointer authentication doesn't sign null pointers, so we have to do some 
explicit extra checks as part of this when the pointer is possibly null, and 
this patch is trying to make that easier to avoid.



================
Comment at: clang/lib/CodeGen/Address.h:26
+// Indicates whether a pointer is known not to be null.
+enum class KnownNonNull_t { False, True };
+
----------------
The usual idiom for this kind of boolean enum is that you just use an unscoped 
enum (not `enum class`), and you spell the enumerators something that reads 
well where it'll be used.  So you'd make the enumerators `KnownNotNull` and (I 
guess) something like `NotKnownNotNull` or `NullnessUnknown`.


================
Comment at: clang/lib/CodeGen/Address.h:97
+  llvm::Value *
+  getPointer(KnownNonNull_t KnownNonNull = KnownNonNull_t::False) const {
     assert(isValid());
----------------
Apologies if this rehashes a conversation we had earlier, but does it make more 
sense to track this internally in `Address`?  Some `Address` sources definitely 
known that they're non-null, and the places where we only know that 
*contextually* could definitely just pass it down to e.g. 
`EmitPointerWithAlignment`.  That would make it a less invasive change for all 
the clients of `getPointer()`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142584/new/

https://reviews.llvm.org/D142584

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D142584: [CodeGen] Add a boolean flag to `Address::getPointer` and `Lvalue::getPointer` that indicates whether the pointer is known not to be null

Reply via email to