On Wed, Apr 17, 2024 at 03:26:53PM +0200, Jan Hubicka wrote:
> > 
> > I've tried to see what actually happens during linking without LTO, so
> > compiled pr113208_0.C with -O1 -fkeep-inline-functions -std=c++20 with
> > vanilla trunk (so it has those 2 separate comdats, one for C2 and one for
> > C1), though I've changed the
> > void m(k);
> > line to
> > __attribute__((noipa)) void m(k) {}
> > in the testcase, then compiled
> > pr113208_1.C with -O2 -fkeep-inline-functions -std=c++20
> > -fno-omit-frame-pointer
> > so that one can clearly differentiate from where the implementation was
> > picked and finally added
> > template <typename _Tp> struct _Vector_base {
> >   int g() const;
> >   _Vector_base(int, int);
> > };
> > 
> > struct QualityValue;
> > template <>
> > _Vector_base<QualityValue>::_Vector_base(int, int) {}
> > template <>
> > int _Vector_base<QualityValue>::g() const { return 0; }
> > int main () {}
> > If I link this, I see _ZN6vectorI12QualityValueEC2ERKS1_ and
> > _ZN6vectorI12QualityValueEC1ERKS1_ as separate functions with the
> > omitted frame pointer bodies, so clearly the pr113208_0.C versions prevailed
> > in both cases.  It is unclear why that isn't the case for LTO.
> 
> I think it is because of -fkeep-inline-functions which makes the first
> object file to define both symbols, while with LTO we optimize out one
> of them.  
> 
> So to reproduce the same behaviour with non-LTO we would probably need to
> use -O1 and arrange the constructor to be uninlinable instead of using
> -fkeep-inline-functions.

Ah, you're right.
If I compile (the one line modified) pr113208_0.C with
-O -fno-early-inlining -fdisable-ipa-inline -std=c++20
it has just _ZN6vectorI12QualityValueEC2ERKS1_ in the
_ZN6vectorI12QualityValueEC2ERKS1_ comdat and no
_ZN6vectorI12QualityValueEC1ERKS1_.  If I then compile pr113208_1.C with
-O -fno-early-inlining -fdisable-ipa-inline -std=c++20 -fno-omit-frame-pointer
and link the two together with the above mentioned third *.C file, I see
000000000040112a <_ZN6vectorI12QualityValueEC2ERKS1_>:
  40112a:       53                      push   %rbx
  40112b:       48 89 fb                mov    %rdi,%rbx
  40112e:       48 89 f7                mov    %rsi,%rdi
  401131:       e8 9c 00 00 00          call   4011d2 <_ZNK12_Vector_baseI12QualityValueE1gEv>
  401136:       89 c2                   mov    %eax,%edx
  401138:       be 01 00 00 00          mov    $0x1,%esi
  40113d:       48 89 df                mov    %rbx,%rdi
  401140:       e8 7b 00 00 00          call   4011c0 <_ZN12_Vector_baseI12QualityValueEC1Eii>
  401145:       5b                      pop    %rbx
  401146:       c3                      ret    
i.e. the C2 prevailing from pr113208_0.s where it is the only symbol, and
0000000000401196 <_ZN6vectorI12QualityValueEC1ERKS1_>:
  401196:       55                      push   %rbp
  401197:       48 89 e5                mov    %rsp,%rbp
  40119a:       53                      push   %rbx
  40119b:       48 83 ec 08             sub    $0x8,%rsp
  40119f:       48 89 fb                mov    %rdi,%rbx
  4011a2:       48 89 f7                mov    %rsi,%rdi
  4011a5:       e8 28 00 00 00          call   4011d2 <_ZNK12_Vector_baseI12QualityValueE1gEv>
  4011aa:       89 c2                   mov    %eax,%edx
  4011ac:       be 01 00 00 00          mov    $0x1,%esi
  4011b1:       48 89 df                mov    %rbx,%rdi
  4011b4:       e8 07 00 00 00          call   4011c0 <_ZN12_Vector_baseI12QualityValueEC1Eii>
  4011b9:       48 8b 5d f8             mov    -0x8(%rbp),%rbx
  4011bd:       c9                      leave  
  4011be:       c3                      ret    
which is the C1 alias, originally aliased to C2 in the C5 comdat.
So that would match the linker behavior: the C1 -> C2 alias from one object
prevails, but a different version of C2 prevails.  So let's either make C1 a
non-alias, or an alias to a non-exported symbol, or something like that.
Though I admit I have no idea what we do with comdats during LTO; perhaps
doing what I said above could break stuff if the linker, after seeing the
objects resulting from LTO, decides on prevailing symbols differently.
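
To illustrate the alias relationship discussed above, here is a minimal
sketch (not taken from the testcase) using the GNU alias attribute, assuming
GCC or Clang on an ELF target.  The names ctor_c2, ctor_c1 and
share_entry_point are hypothetical stand-ins; the real C1/C2 constructors are
emitted by the compiler, not written by hand.  Within one object the two names
resolve to the same code, which is what no longer holds in the linked binary
above, where C2 sits at 40112a and C1 at 401196.

```cpp
// Sketch only: relies on the GNU alias attribute (GCC/Clang, ELF targets).
// ctor_c2 stands in for the base object constructor body (C2).
extern "C" int ctor_c2() { return 42; }

// ctor_c1 stands in for the complete object constructor (C1), emitted as an
// alias of C2 rather than as a second copy of the body.
extern "C" int ctor_c1() __attribute__((alias("ctor_c2")));

// Returns true when both names resolve to the same entry point.
bool share_entry_point() {
  // volatile keeps the comparison from being folded at compile time,
  // so the addresses are actually compared at run time.
  int (*volatile a)() = ctor_c1;
  int (*volatile b)() = ctor_c2;
  return a == b;
}
```

In a single object this holds trivially.  The situation described above arises
only once the linker takes the two symbols from different objects.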

        Jakub
