On Tue, 18 Feb 2014, Jan Hubicka wrote:

> > > Non-ODR types born from other frontends will then need to be made to
> > > alias all the ODR variants that can be done by storing them into the
> > > current canonical type hash.
> > > (I wonder if we want to support cross language aliasing for non-POD?)
> >
> > Surely for accessing components of non-POD types, no?  Like
> >
> > class Foo {
> >   Foo();
> >   int *get_data ();
> >   int *data;
> > } glob_foo;
> >
> > extern "C" int *get_foo_data() { return glob_foo.get_data(); }
>
> OK, if we want to support this, then we want to merge.
> What about types with vtbl pointer? :)

I can easily create a C struct variant covering that.  Basically in
_practice_ I can inter-operate with any language from C if I know its
ABI.

Do we really want to make this undefined?  See the (even standard)
Fortran - C interoperability spec.  I'm sure something exists for
Ada interoperating with C (or even C++).

> > ?  But you are talking about the "tree" merging part using ODR info
> > to also merge types which differ in completeness of contained
> > pointer types, right?  (exactly equal cases should be already merged)
>
> Actually I was speaking of canonical types here. I want to preserve more
> of TBAA via honoring ODR and local types.

So, are you positive there will be a net gain in optimization when
doing that?  Please factor in the surprises you'll get when code
gets "miscompiled" because of "slight" ODR violations or interoperability
that no longer works.

> I want to change lto to not
> merge canonical types for pairs of types of same layout (i.e. equivalent
> in the current canonical type definition) but with different mangled
> names.

Names are nothing ;)  In C I very often see different _names_ used
in headers vs. implementation (when the implementation uses a different
internal header).  You have struct Foo; in public headers vs.
struct Foo_impl; in the implementation.

> I also want it to never merge when types are local. For
> inter-language TBAA we will need to ensure aliasing in between non-ODR
> type of same layout and all unmerged variants of ODR type.
> Can it be
> done by attaching chains of ODR types into the canonical type hash and
> when non-ODR type appears, just make it alias with all of them?

No, how would that work?

> It would make sense to ODR merge in tree merging, too, but I am not sure if
> this fits the current design, since you would need to merge SCC components of
> different shape then that seems hard, right?

Right.
You'd lose the nice incremental SCC merging (where we haven't
even yet implemented the nicest way - avoid re-materializing the SCC
until we know it prevails).

> It may be easier to ODR merge after streaming (during DECL fixup) just to make
> WPA streaming cheaper and to reduce debug info size.  If you use
> -fdump-ipa-devirt, it will dump you ODR types that did not get merged (only
> ones with vtable pointers in them ATM) and there are quite long chains for
> firefox.  Surely then hundreds of duplicated ODR types will end up in the
> ltrans partition streams and they eventually hit debug output machinery.
> Eric sent me presentation about doing this in LLVM.
> http://llvm.org/devmtg/2013-11/slides/Christopher-DebugInfo.pdf

Debuginfo is something completely separate and should be done
separately (early debug), which avoids streaming the types in the
first place.

> > The canonical type computation happens separately (only for prevailing
> > types, of course), and there we already "merge" types which differ
> > in completeness.  Canonical type merging is conservative the other
> > way around - if we merge _all_ types to a single canonical type then
> > TBAA is still correct (we get a single alias set).
>
> Yes, I think I understand that.  One equivalence is kind of minimal so we merge
> only if we are sure there is no information loss, other is maximal so we are
> sure that types that needs to be equivalent by whatever underlying language
> TBAA rules are actually equivalent.

The former is just not correct - it would mean that not merging
at all would be valid, which it is not (you'd create wrong-code all
over the place).

We still don't merge enough (because of latent bugs that I didn't
manage to fix in time) - thus we do not merge all structurally
equivalent types right now.
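
To make the direction of "conservative" concrete, here is a minimal toy
sketch in C++.  It is not GCC code: ToyType, the three merge_* functions,
may_alias and is_safe are hypothetical names, and "types that really share
memory" is modelled simply as "same layout string".  Under those
assumptions, merging everything stays safe (one alias set), while merging
nothing across units reports a must-alias pair as non-aliasing.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of canonical types for TBAA.  All names are hypothetical and
// only illustrate the argument; this is not how GCC represents types.
struct ToyType {
    std::string unit;    // translation unit the type was read from
    std::string layout;  // stand-in for a structural description
};

// A canonicalizer maps a type to a key; equal keys share one alias set.
using Canonicalizer = std::string (*)(const ToyType&);

// Merge everything: a single alias set.
std::string merge_all(const ToyType&) { return "all"; }

// Merge nothing across units: each (unit, layout) pair is its own set.
std::string merge_none(const ToyType& t) { return t.unit + ":" + t.layout; }

// Merge structurally equivalent types regardless of unit.
std::string merge_structural(const ToyType& t) { return t.layout; }

// TBAA-style query: accesses may alias iff the canonical keys are equal.
bool may_alias(Canonicalizer canon, const ToyType& a, const ToyType& b) {
    return canon(a) == canon(b);
}

// Safe means: every pair with the same layout (the model's stand-in for
// "really shares memory") is reported as possibly aliasing.
bool is_safe(Canonicalizer canon, const std::vector<ToyType>& types) {
    for (const ToyType& a : types)
        for (const ToyType& b : types)
            if (a.layout == b.layout && !may_alias(canon, a, b))
                return false;
    return true;
}

// Usage: one struct seen from two units, plus an unrelated type.
void demo() {
    std::vector<ToyType> types = {
        {"a.c", "{int}"}, {"b.c", "{int}"}, {"a.c", "{float}"}};
    assert(is_safe(merge_all, types));         // safe, but no TBAA benefit
    assert(is_safe(merge_structural, types));  // safe and more precise
    assert(!is_safe(merge_none, types));       // the wrong-code case
}
```

In this model, precision is lost by merging too much and correctness is
lost by merging too little, which is the asymmetry described above.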

> > > I also think we want explicit representation of types known to be local
> > > to compilation unit - anonymous namespaces in C/C++, types defined
> > > within function bodies in C and god knows what in Ada/Fortran/Java.
> >
> > But here you get into the idea of improving TBAA, thus having
> > _more_ distinct canonical types?
>
> Yes.
> >
> > Just to make sure to not mix those two ;)
> >
> > And whatever "frontend knowledge" we want to exercise - please
> > make sure we get a reliable way for the middle-end to see
> > that "frontend knowledge" (no langhooks!).  Thus, make it
> > "middle-end knowledge".
>
> Sure that is what I am proposing - just have DECL_ASSEMBLER_NAME on TYPE_DECL
> and ODR flag.  Middle-end when comparing types will test ODR flag and if flag
> is set, then it will compare via DECL_ASSEMBLER_NAME (TYPE_DECL (type)).
> No langhooks needed here + if other language has similar inter-unit
> equivalency it can use the same mechanism.  Just turn the equivalency
> description into string identifiers.

Ok.  You have to be aware of the effects on inter-language
interoperability though (you'll break it).  Thus I'd make this
guarded by -fextra-strict-aliasing and only auto-enable that
when all TUs are produced by the same frontend (easy enough to
check I guess).

Richard.

> > Oh - and the easiest way to improve things is to get less types into
> > the merging process in the first place!
>
> Yep, my experiments with not streaming BINFO are directed in it.  I will
> collect some numbers and send.
>
> Honza
> >
> > Richard.
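
The comparison rule discussed in this thread can be sketched as a toy
model.  This is only an illustration under stated assumptions, not GCC
code: OdrToyType, may_alias_odr and single_frontend are hypothetical
names, the layout string stands in for the structural comparison, and the
"strict" parameter stands in for the proposed -fextra-strict-aliasing
guard.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy model; none of these names are GCC internals.
struct OdrToyType {
    std::string layout;        // stand-in for the structural comparison
    bool odr;                  // set by a frontend whose language has an ODR
    std::string mangled_name;  // only meaningful when odr is true
};

// True when every unit in the link came from the same frontend, the
// condition suggested above for auto-enabling the stricter rule.
bool single_frontend(const std::vector<std::string>& unit_frontends) {
    for (const std::string& f : unit_frontends)
        if (f != unit_frontends.front())
            return false;
    return true;
}

// strict == false: layout alone decides.
// strict == true: two ODR types additionally need equal mangled names.
// A non-ODR type still aliases every ODR type of the same layout.
bool may_alias_odr(const OdrToyType& a, const OdrToyType& b, bool strict) {
    if (a.layout != b.layout)
        return false;
    if (strict && a.odr && b.odr)
        return a.mangled_name == b.mangled_name;
    return true;
}

// Usage: two ODR types of equal layout and a C struct of the same layout.
// "Foo" and "Bar" are placeholder strings, not real mangled names.
void demo() {
    OdrToyType foo{"{int*}", true, "Foo"};
    OdrToyType bar{"{int*}", true, "Bar"};
    OdrToyType c_struct{"{int*}", false, ""};
    assert(may_alias_odr(foo, bar, false));      // layout rule merges them
    assert(!may_alias_odr(foo, bar, true));      // strict rule separates them
    assert(may_alias_odr(c_struct, foo, true));  // C struct aliases both
    assert(may_alias_odr(c_struct, bar, true));
}
```

In this model the strict relation is not transitive: the C struct aliases
both Foo and Bar, yet Foo and Bar do not alias each other.  It therefore
cannot be expressed by assigning every type a single canonical type.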