On 8/8/07, Daniel Berlin <[EMAIL PROTECTED]> wrote:
> I also haven't necessarily said what Ollie has proposed is a bad idea.
> I have simply said the way he has come up with what he proposed is
> not the way we should go about this. It may turn out he has come up
> with exactly the representation we want (though I doubt this, for
> various reasons). The specification given also doesn't even explain
> where/how these operations can occur in GIMPLE, and what they do other
> than "a C++ something something".
>
> Also given that someone already wrote a type-based devirtualizer that
> worked fine, and i don't see how a points-to one is much work, I'd
> like to see more justification for things like PTRMEM_PLUS_EXPR than
> "hey, the C++ FE generates this internally".
OK. It sounds like I need to go into a lot more detail. The new nodes I've proposed aren't actually motivated by the C++ front end, but rather by a consideration of the semantics dictated by the C++ standard. Naturally, this gives rise to some similarity, but for instance, there is no PTRMEM_PLUS_EXPR in the C++ front end, and my definition of PTRMEM_CST disagrees with the current node of the same name. Let's walk through them:

PTRMEM_TYPE

Contains the types of the member (TREE_TYPE) and class (TYPE_PTRMEM_BASETYPE) of this pointer to member. This is hopefully self-explanatory. In the language of the C++ standard, it is the type of a "pointer to member of class TYPE_PTRMEM_BASETYPE of type TREE_TYPE." This is the type of PTRMEM_CSTs, PTRMEM_PLUS_EXPRs, and various kinds of declarations (VAR_DECL, FIELD_DECL, PARM_DECL, etc.).

PTRMEM_CST

The C++ front end already has a PTRMEM_CST node. However, the existing node only contains a class (PTRMEM_CST_CLASS) and member (PTRMEM_CST_MEMBER), and is unable to represent an arbitrary pointer to member value. This is especially evident when dealing with multiple inheritance. Consider the following example:

    struct B { int f (); };
    struct L : B {};
    struct R : B {};
    struct D : L, R {};

    int (B::*pb)() = &B::f;
    int (L::*pl)() = pb;
    int (R::*pr)() = pb;
    int (D::*pd[2])() = { pl, pr };

In this case, pd[0] and pd[1] both have the same type and point to the same member of the same class (B::f), but they point to different base class instances of B. To represent this, we need an offset.

Now, one might argue that rather than a numeric offset, we should point to the _DECL of the base class subobject, but that breaks down because the following is also legal:

    struct B {};
    struct D : B { int f (); };

    int (D::*pd)() = &D::f;
    int (B::*pb)() = static_cast<int (B::*)()>(pd);

In this case, pb points to D::f in the derived class. Since a B contains no D subobject to point to, we see that a numeric offset representation is required.
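
To make the multiple inheritance case concrete, here is a small sketch along the lines of the first example. The member function self(), the tag field, and the helper names via_left, via_right, and reaches_distinct_bases are made up for illustration; they are not part of the proposal. The member reports which B subobject it was called on, so the difference between the two converted pointers becomes observable:

```cpp
#include <cassert>

struct B {
    int tag = 0;
    // Reports which B subobject the call was dispatched to.
    const B* self() const { return this; }
};
struct L : B {};
struct R : B {};
struct D : L, R {};

using BFn = const B* (B::*)() const;
using DFn = const B* (D::*)() const;

// Converts &B::self to a pointer to member of D by way of L.
DFn via_left() {
    BFn pb = &B::self;
    const B* (L::*pl)() const = pb;
    return pl;
}

// Same member, converted by way of R instead.
DFn via_right() {
    BFn pb = &B::self;
    const B* (R::*pr)() const = pb;
    return pr;
}

// True when the two converted pointers reach different B subobjects of d.
bool reaches_distinct_bases(const D& d) {
    const B* left = (d.*via_left())();
    const B* right = (d.*via_right())();
    assert(left == static_cast<const L*>(&d));   // the B inside L
    assert(right == static_cast<const R*>(&d));  // the B inside R
    return left != right;
}
```

Both pointers name B::self and have the same type, yet each call lands on a different this pointer, which is exactly the information a class-plus-member pair alone cannot carry.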
This leads to the definition of PTRMEM_CST which I have adopted. Since the class type is already provided in its type, we store the member (TREE_PTRMEM_CST_MEMBER) and numeric offset (TREE_PTRMEM_CST_OFFSET). The member is one of NULL (for NULL pointers to members), a FIELD_DECL (for non-NULL pointers to data members), or a FUNCTION_DECL (for non-NULL pointers to member functions).

I've chosen the meaning of the offset for convenience. For NULL pointers to members, it's irrelevant. For pointers to data members, it's the offset of the member relative to the current class (as determined by any type conversions). For pointers to member functions, it's the offset to the this pointer which must be passed to the function.

PTRMEM_PLUS_EXPR

From the discussion above, it's clear that type conversions on pointers to members require adjustments to the offsets (to fields or this pointers). We could handle this via CONVERT_EXPRs, but that has two shortcomings: (1) it requires the middle end to compute offsets to base class subobjects, and (2) as the first code example above illustrates, multiple CONVERT_EXPRs cannot be folded together.

To work around these issues, I've implemented the PTRMEM_PLUS_EXPR. It's a binary expression which takes two arguments: an operand of PTRMEM_TYPE and an integral offset expression. These can be nicely constant folded, either with other PTRMEM_PLUS_EXPRs or with PTRMEM_CSTs.

There's also an added benefit when dealing with NULL pointers to members. Consider the following code:

    struct B { int a; };
    struct L : B {};
    struct R : B {};
    struct D : L, R {};

    int B::*pb = NULL;
    int L::*pl = pb;
    int R::*pr = pb;
    int D::*pd[2] = { pl, pr };

The C++ standard states that pd[0] == pd[1] since all NULL pointers to members of the same type compare equal. However, the current GCC implementation gets this wrong because the C++ front end implements pointer to data member conversions via simple addition. In practice, it needs to check for NULL first.
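
A minimal sketch of the difference, under loud assumptions: a pointer to data member is modelled as a bare byte offset, -1 stands in for the NULL pointer to member (that is the value the Itanium C++ ABI uses for data members), and L and R are assumed to sit at offsets 0 and 4 inside D. The names PtrMem, convert_naive, convert_checked, and demo are hypothetical and only serve the illustration:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model: a pointer to data member is just a byte offset,
// and -1 stands in for the NULL pointer to member.
using PtrMem = std::ptrdiff_t;
const PtrMem NULL_MARKER = -1;

// Base-to-derived conversion done by plain addition.
PtrMem convert_naive(PtrMem p, std::ptrdiff_t base_offset) {
    return p + base_offset;
}

// The same conversion, leaving NULL untouched.
PtrMem convert_checked(PtrMem p, std::ptrdiff_t base_offset) {
    return p == NULL_MARKER ? NULL_MARKER : p + base_offset;
}

// Walks through the D : L, R case with the assumed offsets 0 and 4.
void demo() {
    // Plain addition turns one NULL into two different values.
    assert(convert_naive(NULL_MARKER, 0) != convert_naive(NULL_MARKER, 4));
    // With the check, both conversions of NULL still compare equal.
    assert(convert_checked(NULL_MARKER, 0) == convert_checked(NULL_MARKER, 4));
    // Non-NULL values are adjusted as before.
    assert(convert_checked(0, 4) == 4);
}
```

The check costs a comparison on every conversion, so several conversions in a row mean several comparisons.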
However, folding stacked conversions then requires optimizing code like:

    if (d != NULL_MARKER) d += offset1;
    if (d != NULL_MARKER) d += offset2;
    if (d != NULL_MARKER) d += offset3;

to

    if (d != NULL_MARKER) d += offset1 + offset2 + offset3;

GCC's optimizer may well be smart enough to do this, but with PTRMEM_PLUS_EXPRs you get this for free even with optimization disabled. We simply fold the stacked PTRMEM_PLUS_EXPRs into a single expression with an operand of offset1+offset2+offset3 and add the NULL check as part of the RTL expansion.

PTRMEM_REF

This one's pretty straightforward. It takes a class expression and a pointer to member expression and returns a reference to the specified field or function. After all, we have to implement the .* and ->* operators somehow. :P This is the source of my current woes, as this may involve virtual function resolution, which can't be done with the information currently available to the middle end.

Ollie