[PATCH] D64128: [CodeGen] Generate llvm.ptrmask instead of inttoptr(and(ptrtoint, C)) if possible.

Florian Hahn via Phabricator via cfe-commits Thu, 04 Jul 2019 07:27:06 -0700

fhahn added a comment.

Thanks for the quick responses and the helpful comments. Thank you very much, 
Hal, for summarizing the argument from previous discussions. My initial 
understanding was indeed that by generating ptrmask directly for C/C++ 
expressions, we can circumvent the issues that come with ptrtoint/inttoptr in 
LLVM.


One key point that might not be too clear is that the question should be 
whether `(T*) ((intptr_t) x & N)` points to the same underlying object as 
`x`, //iff the mask `N` preserves all 'relevant' bits of the pointer `x`//. I 
am not sure if 'relevant' bits is the best term, but I use it to refer to all 
bits that do not have to be zero due to alignment requirements or pointer size 
restrictions. With that in mind, let me try to cover the possible cases in 
terms of C++'s safely-derived pointers, depending on `x`. (I'm referencing 
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3690.pdf)

1. if `x` is a safely-derived pointer, then the mask is a no-op and the result 
of the expression is a safely-derived pointer; as `x` was safely-derived, all 
bits that are masked out must already be 0, so according to 3.7.4.3.3, 
`reinterpret_cast<void*>(x)` should be equal to 
`reinterpret_cast<void*>(((intptr_t) x & N))`.

2. if `x` is not a safely-derived pointer, but it becomes one after masking: 
then `x` must be the result of a series of bitwise operations that only modify 
the bits masked out later by `N`. Otherwise the whole series of bitwise 
operations, including the masking, would violate `3.7.4.3.3 - the result of an 
additive or bitwise operation, one of whose operands is an integer 
representation of a safely-derived pointer value P, if that result converted by 
reinterpret_cast<void*> would compare equal to a safely-derived pointer 
computable from reinterpret_cast<void*>(P)`

3. if `x` is not a safely-derived pointer and the mask does not turn it into a 
safely-derived pointer: in that case, the masking should again not change the 
safely-derived property, and both would be invalid under strict pointer safety.
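
To make case 1 concrete, here is a minimal sketch of the masking expression 
applied to a safely-derived, aligned pointer. The names `mask_pointer` and 
`mask_is_noop_for_aligned_object` are hypothetical and only for illustration; 
the sketch uses `uintptr_t` instead of `intptr_t` to keep the bitwise 
operations unsigned, and it assumes a mainstream target where the integer 
representation of an 8-byte-aligned pointer has its low three bits clear.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: applies the mask n to x through the integer round
// trip discussed above, i.e. (T*)((uintptr_t)x & n).
template <typename T>
T* mask_pointer(T* x, std::uintptr_t n) {
    return reinterpret_cast<T*>(reinterpret_cast<std::uintptr_t>(x) & n);
}

// Case 1: x is safely derived and 8-byte aligned, and the mask clears only
// the low three bits, which alignment already forces to zero (assumption:
// the target maps alignment onto the low address bits).
inline bool mask_is_noop_for_aligned_object() {
    alignas(8) static std::int64_t object = 0;
    std::int64_t* x = &object;
    const std::uintptr_t n = ~static_cast<std::uintptr_t>(7);
    return mask_pointer(x, n) == x;
}

inline void check_case_1() {
    // The masked pointer should compare equal to the original pointer.
    assert(mask_is_noop_for_aligned_object());
}
```

Under these assumptions the mask preserves all 'relevant' bits, so the result 
compares equal to `x`.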

I think the key case is 2., where the mask operation is the last step in a 
series of bitwise operations that takes an integer representation of a 
safely-derived pointer value `P` and, after masking, yields `P` again. E.g. 
packing/unpacking bits of a tagged pointer `(P | 1) & ~1`. After writing all 
that down, there seems to be one problem though: technically we have a series 
of bitwise operations and the intermediate values are not integer 
representations of safely-derived pointers. One could argue that the bitwise 
operations together cancel each other out and are a no-op, resulting in the 
original pointer.
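
The tagged-pointer pattern `(P | 1) & ~1` can be sketched roughly as follows. 
`Node`, `pack`, `unpack_pointer` and `unpack_tag` are hypothetical names 
introduced only for this example, and the sketch assumes that the low bit of 
an 8-byte-aligned pointer's integer representation is zero, as on mainstream 
targets.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical node type; alignas(8) guarantees spare low bits for the tag
// (assuming alignment is reflected in the low address bits).
struct alignas(8) Node { int value; };

constexpr std::uintptr_t kTagMask = 1;

// Pack: P | 1 (or P | 0). The intermediate value is not the integer
// representation of a safely-derived pointer when the tag is set.
inline std::uintptr_t pack(Node* p, bool tag) {
    return reinterpret_cast<std::uintptr_t>(p) |
           (tag ? kTagMask : std::uintptr_t{0});
}

// Unpack: (P | tag) & ~1. The mask is the last bitwise step and only
// clears the bit that pack() may have set.
inline Node* unpack_pointer(std::uintptr_t packed) {
    return reinterpret_cast<Node*>(packed & ~kTagMask);
}

inline bool unpack_tag(std::uintptr_t packed) {
    return (packed & kTagMask) != 0;
}

inline void check_round_trip() {
    static Node node{42};
    const std::uintptr_t packed = pack(&node, true);
    assert(unpack_tag(packed));
    assert(unpack_pointer(packed) == &node);   // masking recovers P
    assert(unpack_pointer(packed)->value == 42);
}
```

Here the OR and the AND together act as a no-op on `P`, which is exactly the 
situation where the intermediate value `P | 1` raises the question above.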

Does this summary make sense?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64128/new/

https://reviews.llvm.org/D64128



