modimo marked 3 inline comments as done. modimo added a comment. In D36850#2964293 <https://reviews.llvm.org/D36850#2964293>, @tejohnson wrote:
> Good point on indirect calls. Rather than add a bit to the summary, can the > flags just be set conservatively in any function containing an indirect call > when we build the summaries initially? I think that would get the same effect. That could have an issue where A calls {indirect, B} and A gets propagated upon from B without knowledge that the indirect call exists. Right now I've got a FunFlags `hasUnknownCall` which marks these as non-propagatable. > For speculative devirtualization aka ICP, we will still be left with a > fallback indirect call, so it would still need to be treated conservatively. > The extra edges added for ICP promotion candidates shouldn't be a problem or > affect this. Ah good point. I was thinking it may pessimize the propagation because of having to process all of these edges this is a no-go because of the fallback. > Note that with class hierarchy analysis we can do better for virtual calls, > e.g. when -fwhole-program-vtables is enabled for whole program > devirtualization and we have the necessary whole program visibility on > vtables. We could potentially integrate WPD decision here. Even if we can't > find a single devirtualization target, we can compute the set of all possible > targets of virtual calls during the thin link and could use that information > to basically merge the attributes from all possible targets. But probably > best to leave that as a future refinement as it will take some additional > work to get that hooked up. We'd still need to be conservative for virtual > calls when we don't have the necessary type info (when > -fwhole-program-vtables not enabled or we don't have whole program visibility > on the vtable defs), or for non-virtual indirect calls. Agreed, it's an engineering problem more than anything. I ran an optimistic build (same revisions as before, release + noinline) where indirect and virtual calls were assumed to always propagate (thinlto_prop_optimistic) and the effect in Clang self-build at least is not too large: thinlto_base/ "dwarfehprepare.NumCleanupLandingPadsRemaining": 217515, "dwarfehprepare.NumNoUnwind": 299126, "dwarfehprepare.NumUnwind": 332785, thinlto_prop/ "dwarfehprepare.NumCleanupLandingPadsRemaining": 158372, "dwarfehprepare.NumNoUnwind": 420918, "dwarfehprepare.NumUnwind": 210870, thinlto_prop_optimistic/ "dwarfehprepare.NumCleanupLandingPadsRemaining": 154958, "dwarfehprepare.NumNoUnwind": 425893, "dwarfehprepare.NumUnwind": 205889, (425893-420918)/(420918-299126) = 4% boost over being conservative and correct. This may change in real workloads though so I added a `thinlto-funcattrs-optimistic-indirect` flag for easy measurement. ================ Comment at: llvm/lib/Transforms/IPO/FunctionAttrs.cpp:383 + // occur in a couple of circumstances: + // a. An internal function gets imported due to its caller getting + // imported, it becomes AvailableExternally ---------------- tejohnson wrote: > I'm not sure how this case could be happening as we haven't actually done the > importing that would create these new available externally copies yet - that > happens in the LTO backends, during the thin link we just add them to import > lists. I added the test funcattrs-prop-exported-internal.ll that demonstrates this. The summary has its internal linkage converted to external in [thinLTOResolvePrevailingInIndex](https://github.com/llvm/llvm-project/blob/92ce6db/llvm/lib/LTO/LTO.cpp#L436) then converted to AvailableExternally in [thinLTOResolvePrevailingGUID](https://github.com/llvm/llvm-project/blob/92ce6db/llvm/lib/LTO/LTO.cpp#L370). Currently being handled conservatively since a prevailing copy doesn't exist. ================ Comment at: llvm/lib/Transforms/IPO/FunctionAttrs.cpp:388 + // propagating on its caller. + // b. C++11 [temp.explicit]p10 can generate AvailableExternally + // definitions of explicitly instanced template declarations ---------------- tejohnson wrote: > There is no prevailing copy presumably because the prevailing copy is in a > native library being linked? I think these cases can be handled > conservatively. Yeah the prevailing copy is in the native binary. This is a [C++ specific feature](https://github.com/llvm/llvm-project/blob/92ce6db/clang/lib/AST/ASTContext.cpp#L10755) which has ODR and these are already being propagated/inlined from in pre-link. The current thinlink propagation implementation is conservative because a prevailing copy doesn't exist. Currently being handled conservatively since a prevailing copy doesn't exist. ================ Comment at: llvm/lib/Transforms/IPO/FunctionAttrs.cpp:419 + // Virtual calls are unknown so go conservative + if (!FS || FS->getTypeIdInfo()) + return CachedAttributes[VI]; ---------------- tejohnson wrote: > I think this checking for virtual calls will only work if > -fwhole-program-vtables is enabled for whole program devirtualization or CFI. > Otherwise we don't have the type tests that cause this to get populated. This > also won't detect non-virtual indirect calls. I see. I added `hasUnknownCall` as an explicit flag for all indirect calls that should capture both cases. ================ Comment at: llvm/lib/Transforms/IPO/FunctionAttrs.cpp:443 + // we'll keep the last one seen. + ODR = FS; + } else if (!Prevailing && !ODR) { ---------------- tejohnson wrote: > modimo wrote: > > tejohnson wrote: > > > Won't we still have a copy marked prevailing? Wondering if the weak > > > linkage cases can all be merged. > > Yeah, there will still be a copy that's prevailing. Reading through the > > linkage descriptions again and also those in > > `FunctionImportGlobalProcessing::getLinkage`: > > > > 1. I think with External/WeakODR/LinkOnceODR once the prevailing is found > > use that copy > > > > 2. For Weak/LinkOnce even with a prevailing copy I don't know if the copy > > ultimately used will be prevailing. I'm wondering if a native definition > > could be the victor in which case we just can't propagate off these > > functions. > > > > WDYT about (2)? For C++ at least these don't seem to really exist and > > testing with Clang self-build I'm not seeing this kick in anywhere. I added > > a flag to specifically disable this case so it's easy to test out the > > differences. > Since the linker which invokes this should have been passed all objects to > link, bitcode and native, it can do symbol resolution across all of them. So > if there is an overriding native strong symbol, it should see that and the > bitcode resolution would be non-prevailing and all bitcode copies marked dead > (in computeDeadSymbols). So I think the weak any and linkonce any case can > take the prevailing copy. That makes it much easier and everything folds into the prevailing case! Changed and added a test for it. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D36850/new/ https://reviews.llvm.org/D36850 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits