evgeny777 added inline comments.
================ Comment at: llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp:1678 + // breaks any uses on assumes. + if (TypeIdMap.count(TypeId)) + continue; ---------------- tejohnson wrote: > evgeny777 wrote: > > tejohnson wrote: > > > evgeny777 wrote: > > > > tejohnson wrote: > > > > > evgeny777 wrote: > > > > > > I don't think, I understand this. > > > > > > It looks like that if (a) we have type.test in the module and (b) > > > > > > we don't have vtable definition with corresponding type metadata in > > > > > > the same module then we'll remove type.test/assume sequence before > > > > > > even getting to ICP. This seems strange in the context of previous > > > > > > discussion because virtual function may be called in different > > > > > > module from one where vtable is defined, like so: > > > > > > ``` > > > > > > struct A { virtual int foo() { return 42; } }; > > > > > > > > > > > > // calls pA->foo > > > > > > extern int callFoo(A *pA); > > > > > > int main() { A a; return callFoo(&a); } > > > > > > ``` > > > > > > > > > > > We will only be able to do ICP this way if we have the target vtable > > > > > in the module (similar to how we currently can only do ICP if the > > > > > target function is in the module). I anticipate doing something > > > > > similar for vtable defs as to what we do for virtual functions, where > > > > > a fake call edge is added to the summary based on the value profile > > > > > results to ensure they are imported before the LTO backend ICP. > > > > > We will only be able to do ICP this way if we have the target vtable > > > > > in the module (similar to how we currently can only do ICP if the > > > > > target function is in the module). > > > > > > > > I was thinking of slightly different approach: it seems you have > > > > required type information in combined index together with type name in > > > > the module, so you probably can add external declarations of required > > > > vtables (this may require promotion) to the module in the ICP pass and > > > > run ICP over them. Do you think this can work? > > > Possibly, but we'd still need to have type metadata on those vtable > > > declarations, because the type metadata provides the address point offset > > > which is needed in the comparison sequence. > > > Possibly, but we'd still need to have type metadata on those vtable > > > declarations, because the type metadata provides the address point offset > > > which is needed in the comparison sequence. > > > > I think it's stored in the index in VFuncId structures. Isn't it? > > I think it's stored in the index in VFuncId structures. Isn't it? > > No, that holds the offset from the address point to the virtual function > within the vtable def, not the address point offset itself, which is what we > need from the type metadata. But actually, we need both offsets: > > Consider the following example: > > ``` > class A { > virtual void foo(); > } > > void bar(A *a) { > a->foo(); > } > ``` > > The indirect call sequence will look like (not showing the type test for > simplicity): > > ``` > %0 = bitcast %class.A* %a to i32 (%class.A*)*** > %vtable = load i32 (%class.A*)**, i32 (%class.A*)*** %0 > %1 = load i32 (%class.A*)*, i32 (%class.A*)** %vtable > %call = tail call i32 %1(%class.A* %a) > ``` > > With type profiling we will profile the value of %vtable, and figure out that > it is associated with the symbol for A's vtable (the profiling support uses > symbol table info for this), which is: > > ``` > @_ZTV1A = dso_local unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* > null, i8* bitcast ({ i8*, i8* }* @_ZTI1A to i8*), i8* bitcast (i32 > (%class.A*)* @_ZN1A3fooEv to i8*)] }, align 8, !type !0 > > !0 = !{i64 16, !"_ZTS1A"} > ``` > > When we promote the call using the vtable profile results, we need to add the > address point offset (16) from the type metadata to the profiled result which > will be the symbol @_ZTV1A, before comparing against %vtable. > > We then need to look at the IR and compute the offset to the function address > used there (0 in the above IR), and add it to the address point offset, > resulting in 16 here, looking in the vtable def to see what function is at > the combined offset (_ZN1A3fooEv here), so that we insert a direct call to > that function. > > For the address point offset, we only have it in the summary for index-only > WPD (TypeIdCompatibleVtableMap). However, I don't want to restrict the ICP > improvements to that build mode. > > For the latter, the VFuncId does summarize the offsets, but we can compute it > from the IR as I described above (it's the amount added to %vtable before > loading the virtual function pointer) and don't need the summary. And in any > case the VFuncId doesn't tell us which function is at the aggregate offset, > we need the vtable def to tell us that (or the vTableFuncs array saved on the > GlobalVarSummary, again only in index-only WPD mode). > > Also, I believe we can support this type profiling based ICP in non-ThinLTO > mode as well - with my recent changes to make WPD rely on vcall visibility > info, we should be able to insert both the type tests and the type metadata > into the code stream in non-ThinLTO mode. In that case, similar to the way > ICP currently behaves for target functions, the def would have to happen to > already be in the current module for the vtable compare promotion to be > occur. > > For the above reasons, I'd prefer to rely on the type metadata on vtable > defs, with the defs imported as much as possible in ThinLTO mode. Okay, got it. One question: why do you insist on removing such type.test/assume sequences here, instead of simply ignoring them in ICP and eliminating them altogether in LowerTypeTests? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D73242/new/ https://reviews.llvm.org/D73242 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits