On Wed, Apr 17, 2024 at 03:26:53PM +0200, Jan Hubicka wrote:
> > 
> > I've tried to see what actually happens during linking without LTO, so
> > compiled
> > pr113208_0.C with -O1 -fkeep-inline-functions -std=c++20 with vanilla trunk
> > (so it has those 2 separate comdats, one for C2 and one for C1), though I've
> > changed the
> > void m(k);
> > line to
> > __attribute__((noipa)) void m(k) {}
> > in the testcase, then compiled
> > pr113208_1.C with -O2 -fkeep-inline-functions -std=c++20
> > -fno-omit-frame-pointer
> > so that one can clearly differentiate from where the implementation was
> > picked and finally added
> > template <typename _Tp> struct _Vector_base {
> >   int g() const;
> >   _Vector_base(int, int);
> > };
> > 
> > struct QualityValue;
> > template <>
> > _Vector_base<QualityValue>::_Vector_base(int, int) {}
> > template <>
> > int _Vector_base<QualityValue>::g() const { return 0; }
> > int main () {}
> > If I link this, I see _ZN6vectorI12QualityValueEC2ERKS1_ and
> > _ZN6vectorI12QualityValueEC1ERKS1_ as separate functions with the
> > omitted frame pointer bodies, so clearly the pr113208_0.C versions prevailed
> > in both cases.  It is unclear why that isn't the case for LTO.
> 
> I think it is because of -fkeep-inline-functions, which makes the first
> object file define both symbols, while with LTO we optimize out one
> of them.
> 
> So to reproduce the same behaviour with non-LTO we would probably need to use
> -O1 and arrange for the constructor to be uninlinable instead of using
> -fkeep-inline-functions.

Ah, you're right.
If I compile (the one line modified) pr113208_0.C with
-O -fno-early-inlining -fdisable-ipa-inline -std=c++20
it does have just _ZN6vectorI12QualityValueEC2ERKS1_ in the
_ZN6vectorI12QualityValueEC2ERKS1_ comdat and no
_ZN6vectorI12QualityValueEC1ERKS1_.
If I then compile pr113208_1.C with
-O -fno-early-inlining -fdisable-ipa-inline -std=c++20 -fno-omit-frame-pointer
and link the two together with the above mentioned third *.C file, I see

000000000040112a <_ZN6vectorI12QualityValueEC2ERKS1_>:
  40112a:       53                      push   %rbx
  40112b:       48 89 fb                mov    %rdi,%rbx
  40112e:       48 89 f7                mov    %rsi,%rdi
  401131:       e8 9c 00 00 00          call   4011d2 <_ZNK12_Vector_baseI12QualityValueE1gEv>
  401136:       89 c2                   mov    %eax,%edx
  401138:       be 01 00 00 00          mov    $0x1,%esi
  40113d:       48 89 df                mov    %rbx,%rdi
  401140:       e8 7b 00 00 00          call   4011c0 <_ZN12_Vector_baseI12QualityValueEC1Eii>
  401145:       5b                      pop    %rbx
  401146:       c3                      ret

i.e. the C2 prevailing from pr113208_0.s where it is the only symbol, and

0000000000401196 <_ZN6vectorI12QualityValueEC1ERKS1_>:
  401196:       55                      push   %rbp
  401197:       48 89 e5                mov    %rsp,%rbp
  40119a:       53                      push   %rbx
  40119b:       48 83 ec 08             sub    $0x8,%rsp
  40119f:       48 89 fb                mov    %rdi,%rbx
  4011a2:       48 89 f7                mov    %rsi,%rdi
  4011a5:       e8 28 00 00 00          call   4011d2 <_ZNK12_Vector_baseI12QualityValueE1gEv>
  4011aa:       89 c2                   mov    %eax,%edx
  4011ac:       be 01 00 00 00          mov    $0x1,%esi
  4011b1:       48 89 df                mov    %rbx,%rdi
  4011b4:       e8 07 00 00 00          call   4011c0 <_ZN12_Vector_baseI12QualityValueEC1Eii>
  4011b9:       48 8b 5d f8             mov    -0x8(%rbp),%rbx
  4011bd:       c9                      leave
  4011be:       c3                      ret

which is the C1 alias originally aliased to C2 in the C5 comdat.
So, that would match linker behavior where it sees that the C1 -> C2 alias
prevails, but a different version of C2 prevails, so let's either make C1
a non-alias or alias to a non-exported symbol or something like that.
Though, I admit I have no idea what we do with comdats during LTO;
perhaps doing what I said above could break stuff if the linker, after seeing
the LTO resulting objects, decides on prevailing symbols differently.

	Jakub