steakhal added inline comments.
================ Comment at: clang/test/Analysis/reinterpret-cast-pointer-to-member.cpp:29-31 + Some some; + some.*sf = 14; + clang_analyzer_eval(some.*sf == 14); // expected-warning{{UNKNOWN}} ---------------- RedDocMD wrote: > steakhal wrote: > > RedDocMD wrote: > > > steakhal wrote: > > > > The assignment is actually UB. > > > > TBH I don't know how to test such behavior xD > > > > Same for the next example. > > > > > > > > Shouldn't it return `undef` for reading via an invalid member pointer? > > > I don't quite know what is `undef`. Is it a special macro or something? > > > Using an invalid member pointer is definitely UB, but I also need to show > > > that casting to an invalid pointer is properly handled because that is > > > not UB. I guess I will remove the member access and claim that since > > > there was no crash, it is okay and has been handled appropriately. > > By `undef`, I was referring to the `clang::ento::UndefinedVal` - which > > usually represents values that shouldn't be used, not even read. For > > example, an uninitialized variable has undefined value. Or an out-of-bound > > memory access also produces undefined value. There are a few checkers that > > are hunting for such undefined values like CallAndMessageChecker, > > UndefBranchChecker, UndefinedArraySubscriptChecker, > > UndefinedAssignmentChecker, UndefResultChecker. > > > > Probably removing the member accesses is the best you can do. > > > > > but I also need to show that casting to an invalid pointer is properly > > > handled because that is not UB > > In this test, I can't see where you cast the member pointer back to make it > > valid again. > > In this test, I can't see where you cast the member pointer back to make it > > valid again. > > I meant that `sf` when used will lead to undefined behaviour. But the > statement `int Some::*sf = reinterpret_cast<int Some::*>(ddf);` isn't > undefined, is it? AFAIK, `reinterpret_cast` is the C++ equivalent of the C > unchecked cast - You can cast pointers as you wish but don't bet on the > result. > On the other hand, pointer-to-members are unlike regular pointers, so there > may be a catch. But all I really require is that the statement `int Some::*sf > = reinterpret_cast<int Some::*>(ddf);` to not drive the Static Analyzer crazy > - it simply gives up on trying to reason about `sf`. The test would be to > confirm that. So removing the member access should solve it > I meant that sf when used will lead to undefined behaviour It depends on how you define //use//. It's fine to make copy of the pointer `sf`, but dereferencing `sf` is not. My problem, as I carefully highlighted, is about the dereference. You can do exactly 2 things with the invalid `sf` pointer: - cast it back to the original type, or - simply not use (not dereference) :D The result of the dereference of an invalid pointer could be modelled as either `UnknownVal` or `UndefinedVal`. The latter would be more accurate, but the former is also a valid //approximation//. ================ Comment at: clang/test/Analysis/reinterpret-cast-pointer-to-member.cpp:52-62 +void testMultiple() { + int F::*f = &F::field; + int A::*a = reinterpret_cast<int A::*>(f); + int C::*c = reinterpret_cast<int C::*>(f); + A aobj; + C cobj; + aobj.*a = 13; ---------------- RedDocMD wrote: > steakhal wrote: > > RedDocMD wrote: > > > steakhal wrote: > > > > Wait a minute. It's not how it works. > > > > How I imagine member pointers, they are just offsets. > > > > `&F::field` is notionally equivalent with `offsetof(F, field)`. That > > > > being said, You can not apply this member pointer to any object besides > > > > `F`. > > > > Imagine if the classes of the inheritance tree would have other fields > > > > as well. > > > > Then the `offsetof(T, field)` would be different for `F`, and `C`. > > > > > > > > This example demonstrates that both of these member pointer > > > > dereferences are UB. > > > > https://godbolt.org/z/15sMEP > > > > It returns different values depending on the optimization level, which > > > > is a clear sign of UB. > > > > BTW this issue is closely related to strict aliasing. > > > The member access on A is definitely UB, I guess I will do what I > > > proposed in the `Some` case. > > > I don't think the other one is. Consider the following: > > > ``` > > > struct A {}; > > > struct B : public A {}; > > > struct C { > > > int field; > > > }; > > > struct D : public C {}; > > > struct E : public B, public D {}; > > > struct F : public E {}; > > > > > > int main() { > > > int F::* ff = &F::field; > > > int C::* cf1 = static_cast<int C::*>(ff); > > > int C::* cf2 = reinterpret_cast<int C::*>(ff); > > > C c; > > > c.*cf1 = 10; > > > c.*cf2 = 10; > > > return 0; > > > } > > > ``` > > > `cf1` and `cf2` are the same thing, except that they are declared > > > differently (one via `static_cast`, other via `reinterpret_cast`). If we > > > look at the AST (truncated to the first three lines of main): > > > ``` > > > CompoundStmt 0x1a4fe18 <col:12, line:18:1> > > > |-DeclStmt 0x1a4f3a8 <line:11:3, col:26> > > > | `-VarDecl 0x1a21078 <col:3, col:21> col:12 used ff 'int F::*' cinit > > > | `-ImplicitCastExpr 0x1a4f378 <col:17, col:21> 'int F::*' > > > <BaseToDerivedMemberPointer (E -> D -> C)> > > > | `-UnaryOperator 0x1a4f360 <col:17, col:21> 'int C::*' prefix > > > '&' cannot overflow > > > | `-DeclRefExpr 0x1a4f2f8 <col:18, col:21> 'int' lvalue Field > > > 0x1a207d0 'field' 'int' > > > |-DeclStmt 0x1a4f560 <line:12:3, col:43> > > > | `-VarDecl 0x1a4f428 <col:3, col:42> col:12 used cf1 'int C::*' cinit > > > | `-CXXStaticCastExpr 0x1a4f518 <col:18, col:42> 'int C::*' > > > static_cast<int struct C::*> <DerivedToBaseMemberPointer (E -> D -> C)> > > > | `-ImplicitCastExpr 0x1a4f500 <col:40> 'int F::*' > > > <LValueToRValue> part_of_explicit_cast > > > | `-DeclRefExpr 0x1a4f498 <col:40> 'int F::*' lvalue Var > > > 0x1a21078 'ff' 'int F::*' > > > |-DeclStmt 0x1a4f6e8 <line:13:3, col:48> > > > | `-VarDecl 0x1a4f5c8 <col:3, col:47> col:12 used cf2 'int C::*' cinit > > > | `-CXXReinterpretCastExpr 0x1a4f6b8 <col:18, col:47> 'int C::*' > > > reinterpret_cast<int struct C::*> <ReinterpretMemberPointer> > > > | `-ImplicitCastExpr 0x1a4f6a0 <col:45> 'int F::*' > > > <LValueToRValue> part_of_explicit_cast > > > | `-DeclRefExpr 0x1a4f638 <col:45> 'int F::*' lvalue Var > > > 0x1a21078 'ff' 'int F::*' > > > ``` > > > Notice how the `static_cast` figures out the path to the correct > > > subobject. This is how member-pointers are handled as far as I can tell. > > > Unfortunately, Stroustrup is surprisingly scant when it comes to this > > > topic. I am trying to dig through the Standard. > > I tried to highlight that the classes could have non-static data members > > which could cause your assumption about `cf1 == cf2` not to hold anymore. > > See my previously attached godbolt link. > > > > `static_cast<T>(p)` makes sure that the //offset// that the member pointer > > `p` represents is updated accordingly to match the //offset// of the > > `field` member within the new type `T`. Usually, it involves some > > addition/subtraction to accommodate this. > > While `static_cast<T>(p)` does not update the value, just reinterprets it > > in a different way - which usually results in an invalid pointer though and > > a bunch of subtle rules kick in such as the type similarity, pointer > > interchangeability, which in general known as //strict-aliasing//. > > While static_cast<T>(p) does not update the value, just reinterprets it in > > a different way - which usually results in an invalid pointer though and > > a bunch of subtle rules kick in such as the type similarity, pointer > > interchangeability, which in general known as strict-aliasing. > Presumably you mean `reinterpret_cast`. > > Ah, I see. This makes it glaringly obvious: [[ https://godbolt.org/z/f6vs77 | > Another Godbolt link ]] > So I guess handling `reinterpret_cast` for pointer-to-members pointless. I > guess the idea will be to simply not analyze pointer-to-members which have > been obtained through `reinterpret_cast.` > Presumably you mean reinterpret_cast. Yes, sorry for copy-pasting too much xD > So I guess handling reinterpret_cast for pointer-to-members pointless. I don't know if it's really pointless, but hard to model precisely for sure. > I guess the idea will be to simply not analyze pointer-to-members which have > been obtained through `reinterpret_cast`. I guess, you could just return unknown? I don't know. This area is not my expertise, I just wanted to get you going with my observations. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D96976/new/ https://reviews.llvm.org/D96976 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits