https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62306
--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> --- I don't remember, but IRC log does something: IRC #gcc on oftc Dec 1 2009 <jakub> jason: so, I have a patch to handle the implicit cdtors in extern template class by exporting the cdtors at the explicit instantiation side and never emit them at the side with just extern template class <jakub> jason: which cures the dtor either D[012], or none of them check for virtual dtors <jason> I would only do that for virtual dtors <jakub> jason: unfortunately it is probably an incompatible ABI change, given that we weren't previously exporting them all at the point of explicit instantiation <jason> mm <jason> true <jakub> jason: with the patch a few libstdc++ tests fail, because _ZNSt15basic_stringbufIcSt11char_traitsIcESaIcEED2Ev is not exported from libstdc++.so.6 <jason> that's with an old libstdc++? <jakub> jason: while we could export it now, similar issue could be with other C++ shared libraries - compiled with older g++ they wouldn't export some dtors that new g++ compiled code would now expect in it <jason> right <jakub> jason: that is with new libstdc++, which has now those dtors emitted, but hidden by the linker script <jason> so, I don't think that's worth doing <jakub> jason: as I wrote, in the libstdc++ case we could export them at GLIBCXX_3.4.14 <jakub> jason: so, can I change the patch not to emit *D0* deleting dtor in the *D5* group then? <jakub> jason: or do you see other options? <jason> what's the problem with just emitting them as needed like we do now? <jakub> jason: if the *D5* comdat group is emitted sometimes with all of *D[012]* and sometimes with just *D[12]* (and perhaps sometimes with just *D0* symbols in it <richi> emit them as needed outside of any comdat group? <jakub> jason: then the linker will pick just one of the *D5* comdat groups, and depending on which symbols that group has, it will either allow stuff to be linked or not <jason> the D5 comdat group should always be emitted with all of D[012]. <jason> that's separate from what we export from libstdc++ <jason> (D[012] or D[12] depending on whether or not D0 exists) <jakub> jason: if it is a requirement that we always emit all of D[012] if dtor is virtual, then we'll just emit bloat <jason> why? <jakub> jason: because we don't need the D0 anywhere <jason> I see <jakub> jason: say we have libfoo.so which contains extern template class S<5>; template class S<5>; where S has trivial virtual dtor <jakub> jason: that library will likely have D[012] emitted <jakub> jason: then in some other DSO we just use extern template class S<5>;, current g++ (unfortunately) will emit D[12] when those are referenced by the code <jakub> jason: if we need D[012] in D5 comdat group, it would mean we have to also emit *D0* in that other DSO <jason> but D0 is just two calls <jason> and we won't emit anything if inlining is on <jason> oddly, the ABI doesn't seem to say anything about explicit instantiation <jakub> jason: true, D0 is just two calls <jakub> jason: it is not true that we don't emit anything if inlining is on, the inliner apparently doesn't inline in many cases when it thinks the caller is cold <jason> ah <jakub> jason: plus, I have no idea how to teach cgraph etc. that it should either emit all of D[012], or none of them <jakub> jason: this is quite different case than the same body aliases, in this case it is two different functions <jakub> jason: that's going to be quite non-trivial especially on the lto side :( <jason> My thinking was that the ABI change should also support implementations that implement D0 as another entry point into the destructor <jakub> jason: I understand that reason (though I think the 2 calls solution is far more likely, as D0 is really D1 plus delete) <jakub> jason: I'm just saying that it has a lot of issues <jakub> jason: perhaps we could avoid doing the same body optimization (== use D5 group at all) when 1) DECL_ONE_ONLY 2) the virtual dtor is trivial 3) extern template class was seen <jakub> jason: at least as a short term solution, to at least optimize all the other cases <jason> jakub: I suppose that's ok as a short term solution <jason> s/trivial/implicitly defined/ <jason> what do we do for user-defined inline [cd]tors? <jakub> jason: those are either inlined successfully, or external symbol references <jason> so we handle synthesized methods differently from user inlines? oops <jason> they ought to work the same <jakub> jason: the problem is that do_type_instantiation ignores FUNCTION_DECLs with !DECL_TEMPLATE_INSTANTIATION and that's the case for artificial methods <jakub> jason: let me post the patch <jakub> jason: pt.c patch with description posted <jakub> jason: is there a way to check for the extern template class seen? <jason> DECL_EXPLICIT_INSTANTIATION && DECL_REALLY_EXTERN <jason> or CLASSTYPE_ <jakub> jason: these synthetized methods never have any DECL_TEMPLATE_USE non-zero <jason> right, would need to check CLASSTYPE_ <jakub> jason: CLASSTYPE_EXPLICIT_INSTANTIATION (DECL_CONTEXT ()) will also trigger for just template class XXX<NNN>; without any extern <jakub> jason: though perhaps it is valid to use just template class XXX<NNN>; on one side and extern template class XXX<NNN>; ont eh other side <jason> ah, right, you want CLASSTYPE_INTERFACE_ONLY as well <jakub> actually, probably not, because then in the same library we would mix D5 comdat group and D0/D1/D2 groups <jakub> jason: so, do you think: <jakub> @@ -291,7 +291,16 @@ maybe_clone_body (tree fn) <jakub> && DECL_INTERFACE_KNOWN (fns[0]) <jakub> && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fns[0])) <jakub> && (!DECL_ONE_ONLY (fns[0]) <jakub> - || (HAVE_COMDAT_GROUP && DECL_WEAK (fns[0]))) <jakub> + || (HAVE_COMDAT_GROUP <jakub> + && DECL_WEAK (fns[0]) <jakub> + /* Don't optimize synthetized virtual dtors of <jakub> + explicitly instantiated classes, because then <jakub> + the deleting and other dtors are emitted when needed <jakub> + and so it is not certain we would emit both <jakub> + deleting and complete/base dtors in the comdat group. */ <jakub> + && (fns[2] == NULL <jakub> + || !CLASSTYPE_EXPLICIT_INSTANTIATION (DECL_CONTEXT (fn)) <jakub> + || !DECL_ARTIFICIAL (fn)))) <jakub> && cgraph_same_body_alias (clone, fns[0])) <jakub> { <jakub> alias = true; <jakub> would be ok? It works on the cplx.ii testcase at least <jason> jakub: hmm...if we have a 'template class ...' without an 'extern template class...' I think that would mean emitting D5 in one .o and D[12] comdats in another <jakub> jason: I believe we'll use D[012] comdats in both cases, let me check that patch <jason> I mean if one object has and explicit instation, and another has none <jason> s/and/an/ <jakub> jason: other has implicit one you mean? <jason> right <jakub> jason: true :( <jakub> jason: so that would mean not optimizing any synthetized virtual dtors, right? <jakub> jason: perhaps they aren't so common <jason> or accepting the bloat, or leaving D0 out of D5 <jakub> jason: accepting the bloat means spending a few days in cgraph trying to teach it to do this, so nothing I'd love to do right now <jakub> jason: leaving D0 out of D5 would be easiest, but would perhaps be a problem for other implementations <jakub> - && !DECL_ONE_ONLY (fns[0]) <jakub> + && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fns[0])) <jakub> + && (!DECL_ONE_ONLY (fns[0]) <jakub> + || (HAVE_COMDAT_GROUP <jakub> + && DECL_WEAK (fns[0]) <jakub> + /* Don't optimize synthetized virtual dtors, because then <jakub> + the deleting and other dtors are emitted when needed <jakub> + and so it is not certain we would emit both <jakub> + deleting and complete/base dtors in the comdat group. */ <jakub> + && (fns[2] == NULL || !DECL_ARTIFICIAL (fn)))) <jakub> jason: I guess both "accepting the bloat" and leaving D0 out of D5 are irreversible decisions <jason> right <jakub> jason: while this one would allow us to think about it some more <jason> yep <jason> so let's go ahead with this one for now <jakub> jason: ok, I'll test that right now (with the checking patch applied on top of it of course) <jakub> jason: perhaps this change will also make the gnu.ver changes unnecessary, let me check that <jakub> jason: libstdc++.so.6 size grows with that change compared to the posted patch by 128 bytes, that would be acceptable <jakub> jason: and gnu.ver change is really unnecessary in that case <drow> It may be masked for GCC because we use weak comdat symbols. <drow> Why do we even do that? <drow> jakub: Interestingly, armcc just puts both in the C1 comdat group. <drow> AREA ||i._ZN1XC1Ev||, COMGROUP=_ZN1XC1Ev, CODE, READONLY, ALIGN=2 <drow> _ZN1XC2Ev ; Alternate entry point <drow> _ZN1XC1Ev PROC <drow> And I can produce all sorts of COMDAT-related link errors without any trouble :-( * drow gets it now, switches back to email. <drow> Thanks. <jakub> drow: the [CD]5 comdat group names was something that was discussed on cxx-abi ml <drow> Oh. Well then. <drow> I should obviously shut up as you're still several steps ahead of me. <jakub> drow: it was Jason who discussed it there; anyway, if the mixing of .o files would be very common, it might be a reason to consider some linker extension to allow to say that this SHT_GROUP obsoletes a list of other SHT_GROUP comdat groups <jakub> drow: but I believe the usual case is that the whole shared library or executable is compiled with the same compiler <drow> Yeah. I'm reading the cxx-abi-dev archives to see how this fits together and the consensus seems to be 'it's wildly broken' :-) <drow> We have a case to consider that I suspect icc does too; GCC used to build libstdc++ and some system libraries, realview or icc used to build the application. But I don't expect a lot of this sort of problem from that scenario <jakub> drow: sure, this should have been talked about many years ago <jakub> drow: well, if you build shared libraries with one compiler and binary with another one, it is not a problem, only if you link *.a libs into the binary or shared libraries <jakub> drow: that happens from time to time as well, but icc has always an option to do the same as gcc (perhaps under some option) and even if it doesn't, it is just some bloat <jakub> drow: emitting the complete and base dtors or ctors as aliases doesn't only save .text space, but e.g. allows you to say store &&label in ctor or dtor static variables <jakub> which is not possible if the ctor or dtor has two different bodies, but the static vars in it are obviously shared <drow> jakub: Dare I ask... what happens if a constructor has a static var, anyway? Do the in-charge and not-in-charge versions get the same variable? I don't remember the ABI mandating naming.