tejohnson marked an inline comment as done.
tejohnson added inline comments.


================
Comment at: llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp:1678
+    // breaks any uses on assumes.
+    if (TypeIdMap.count(TypeId))
+      continue;
----------------
evgeny777 wrote:
> tejohnson wrote:
> > evgeny777 wrote:
> > > tejohnson wrote:
> > > > evgeny777 wrote:
> > > > > tejohnson wrote:
> > > > > > evgeny777 wrote:
> > > > > > > I don't think, I understand this.
> > > > > > > It looks like that if (a) we have type.test in the module and (b) 
> > > > > > > we don't have vtable definition with corresponding type metadata 
> > > > > > > in the same module then we'll remove type.test/assume sequence 
> > > > > > > before even getting to ICP. This seems strange in the context of 
> > > > > > > previous discussion because virtual function may be called in 
> > > > > > > different module from one where vtable is defined, like so:
> > > > > > > ```
> > > > > > > struct A { virtual int foo() { return 42; } };
> > > > > > > 
> > > > > > > // calls pA->foo
> > > > > > > extern int callFoo(A *pA);
> > > > > > > int main() { A a; return callFoo(&a); }
> > > > > > > ```
> > > > > > > 
> > > > > > We will only be able to do ICP this way if we have the target 
> > > > > > vtable in the module (similar to how we currently can only do ICP 
> > > > > > if the target function is in the module). I anticipate doing 
> > > > > > something similar for vtable defs as to what we do for virtual 
> > > > > > functions, where a fake call edge is added to the summary based on 
> > > > > > the value profile results to ensure they are imported before the 
> > > > > > LTO backend ICP.
> > > > > > We will only be able to do ICP this way if we have the target 
> > > > > > vtable in the module (similar to how we currently can only do ICP 
> > > > > > if the target function is in the module).
> > > > > 
> > > > > I was thinking of slightly different approach: it seems you have 
> > > > > required type information in combined index together with type name 
> > > > > in the module, so you probably can add external declarations of 
> > > > > required vtables (this may require promotion) to the module in the 
> > > > > ICP pass and run ICP over them. Do you think this can work?
> > > > Possibly, but we'd still need to have type metadata on those vtable 
> > > > declarations, because the type metadata provides the address point 
> > > > offset which is needed in the comparison sequence.
> > > > Possibly, but we'd still need to have type metadata on those vtable 
> > > > declarations, because the type metadata provides the address point 
> > > > offset which is needed in the comparison sequence.
> > > 
> > > I think it's stored in the index in VFuncId structures. Isn't it?
> > > I think it's stored in the index in VFuncId structures. Isn't it?
> > 
> > No, that holds the offset from the address point to the virtual function 
> > within the vtable def, not the address point offset itself, which is what 
> > we need from the type metadata. But actually, we need both offsets:
> > 
> > Consider the following example:
> > 
> > ```
> > class A {
> >    virtual void foo();
> > }
> > 
> > void bar(A *a) {
> >    a->foo();
> > }
> > ```
> > 
> > The indirect call sequence will look like (not showing the type test for 
> > simplicity):
> > 
> > ```
> >  %0 = bitcast %class.A* %a to i32 (%class.A*)***
> >   %vtable = load i32 (%class.A*)**, i32 (%class.A*)*** %0
> >   %1 = load i32 (%class.A*)*, i32 (%class.A*)** %vtable
> >   %call = tail call i32 %1(%class.A* %a)
> > ```
> > 
> > With type profiling we will profile the value of %vtable, and figure out 
> > that it is associated with the symbol for A's vtable (the profiling support 
> > uses symbol table info for this), which is:
> > 
> > ```
> > @_ZTV1A = dso_local unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* 
> > null, i8* bitcast ({ i8*, i8* }* @_ZTI1A to i8*), i8* bitcast (i32 
> > (%class.A*)* @_ZN1A3fooEv to i8*)] }, align 8, !type !0
> > 
> > !0 = !{i64 16, !"_ZTS1A"}
> > ```
> > 
> > When we promote the call using the vtable profile results, we need to add 
> > the address point offset (16) from the type metadata to the profiled result 
> > which will be the symbol @_ZTV1A, before comparing against %vtable.
> > 
> > We then need to look at the IR and compute the offset to the function 
> > address used there (0 in the above IR), and add it to the address point 
> > offset, resulting in 16 here, looking in the vtable def to see what 
> > function is at the combined offset (_ZN1A3fooEv here), so that we insert a 
> > direct call to that function.
> > 
> > For the address point offset, we only have it in the summary for index-only 
> > WPD (TypeIdCompatibleVtableMap). However, I don't want to restrict the ICP 
> > improvements to that build mode.
> > 
> > For the latter, the VFuncId does summarize the offsets, but we can compute 
> > it from the IR as I described above (it's the amount added to %vtable 
> > before loading the virtual function pointer) and don't need the summary. 
> > And in any case the VFuncId doesn't tell us which function is at the 
> > aggregate offset, we need the vtable def to tell us that (or the 
> > vTableFuncs array saved on the GlobalVarSummary, again only in index-only 
> > WPD mode).
> > 
> > Also, I believe we can support this type profiling based ICP in non-ThinLTO 
> > mode as well - with my recent changes to make WPD rely on vcall visibility 
> > info, we should be able to insert both the type tests and the type metadata 
> > into the code stream in non-ThinLTO mode. In that case, similar to the way 
> > ICP currently behaves for target functions, the def would have to happen to 
> > already be in the current module for the vtable compare promotion to be 
> > occur. 
> > 
> > For the above reasons, I'd prefer to rely on the type metadata on vtable 
> > defs, with the defs imported as much as possible in ThinLTO mode.
> Okay, got it. One question: why do you insist on removing such 
> type.test/assume sequences here, instead of simply ignoring them in ICP and 
> eliminating them altogether in LowerTypeTests?
ICP isn't the issue, it is what will happen in LTT. Without any associated type 
metadata, as noted in my comment here, LTT will think they are Unsat and lower 
to false, which is incorrect and results in an llvm.assume(false) which will 
turn into an unreachable, resulting in runtime failures.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73242/new/

https://reviews.llvm.org/D73242



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to