> > Non-ODR types born from other frontends will then need to be made to
> > alias all the ODR variants that can be done by storing them into the
> > current canonical type hash.
> > (I wonder if we want to support cross language aliasing for non-POD?)
>
> Surely for accessing components of non-POD types, no?  Like
>
> class Foo {
> Foo();
> int *get_data ();
> int *data;
> } glob_foo;
>
> extern "C" int *get_foo_data() { return glob_foo.get_data(); }

OK, if we want to support this, then we want to merge.
What about types with a vtbl pointer? :)
>
> ?  But you are talking about the "tree" merging part using ODR info
> to also merge types which differ in completeness of contained
> pointer types, right?  (exactly equal cases should be already merged)

Actually I was speaking of canonical types here.  I want to preserve more
of TBAA by honoring ODR and local types.

I want to change LTO to not merge canonical types for pairs of types of
the same layout (i.e. equivalent in the current canonical type definition)
but with different mangled names.  I also want it to never merge when
types are local.

For inter-language TBAA we will need to ensure aliasing between a non-ODR
type of the same layout and all unmerged variants of the ODR type.
Can it be done by attaching chains of ODR types to the canonical type hash
and, when a non-ODR type appears, just making it alias with all of them?

It would make sense to ODR merge in tree merging, too, but I am not sure if
this fits the current design, since you would then need to merge SCC
components of different shape, and that seems hard, right?
It may be easier to ODR merge after streaming (during DECL fixup) just to
make WPA streaming cheaper and to reduce debug info size.  If you use
-fdump-ipa-devirt, it will dump the ODR types that did not get merged
(only the ones with vtable pointers in them ATM), and there are quite long
chains for Firefox.  Surely hundreds of duplicated ODR types will then end
up in the ltrans partition streams and eventually hit the debug output
machinery.

Eric sent me a presentation about doing this in LLVM:
http://llvm.org/devmtg/2013-11/slides/Christopher-DebugInfo.pdf
>
> The canonical type computation happens separately (only for prevailing
> types, of course), and there we already "merge" types which differ
> in completeness.
> Canonical type merging is conservative the other
> way around - if we merge _all_ types to a single canonical type then
> TBAA is still correct (we get a single alias set).

Yes, I think I understand that.  One equivalence is kind of minimal, so we
merge only if we are sure there is no information loss; the other is
maximal, so we are sure that types that need to be equivalent by whatever
underlying language TBAA rules are actually equivalent.
>
> > I also think we want explicit representation of types known to be local
> > to compilation unit - anonymous namespaces in C/C++, types defined
> > within function bodies in C and god knows what in Ada/Fortran/Java.
>
> But here you get into the idea of improving TBAA, thus having
> _more_ distinct canonical types?

Yes.
>
> Just to make sure to not mix those two ;)
>
> And whatever "frontend knowledge" we want to exercise - please
> make sure we get a reliable way for the middle-end to see
> that "frontend knowledge" (no langhooks!).  Thus, make it
> "middle-end knowledge".

Sure, that is what I am proposing - just have DECL_ASSEMBLER_NAME on
TYPE_DECL and an ODR flag.  The middle-end, when comparing types, will test
the ODR flag and, if the flag is set, compare via
DECL_ASSEMBLER_NAME (TYPE_DECL (type)).
No langhooks needed here + if another language has a similar inter-unit
equivalency it can use the same mechanism.  Just turn the equivalency
description into string identifiers.
>
> Oh - and the easiest way to improve things is to get less types into
> the merging process in the first place!

Yep, my experiments with not streaming BINFO are directed at that.  I will
collect some numbers and send them.

Honza
>
> Richard.
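
To make the comparison rule discussed above concrete, here is a toy sketch
in C++.  It is not GCC code: ToyType, its fields and may_alias are all
hypothetical names invented for illustration.  The "layout" string stands in
for the current structural canonical type equivalence, and "name" stands in
for the string identifier (the mangled name) attached to the type.  How
local types should interact with non-ODR types is an assumption of this
sketch, not something the thread settles.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for a middle-end type.  None of these fields exist in GCC.
struct ToyType {
  std::string layout;  // stands in for today's structural equivalence
  bool odr;            // hypothetical "ODR flag"
  std::string name;    // stands in for the mangled name of the type
  bool local;          // e.g. a type from an anonymous namespace
  int unit;            // translation unit id, used only for local types
};

// Must TBAA assume that accesses through a and b can conflict?
bool may_alias(const ToyType &a, const ToyType &b) {
  // Different layout: already distinct under the current definition.
  if (a.layout != b.layout)
    return false;
  // Local types are never merged with anything outside their own unit
  // (assumption of this sketch).
  if (a.local || b.local)
    return a.local && b.local && a.unit == b.unit && a.name == b.name;
  // Both ODR: the string identifiers decide.
  if (a.odr && b.odr)
    return a.name == b.name;
  // At least one non-ODR type: it aliases every variant of that layout.
  return true;
}
```

In this model a C struct (non-ODR) with the same layout as two C++ classes
Foo and Bar conflicts with both, while Foo and Bar do not conflict with
each other, so the relation is not transitive.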