Re: [whopr] Design/implementation alternatives for the driver and WPA

Ian Lance Taylor Wed, 04 Jun 2008 13:46:07 -0700

Nick Kledzik <[EMAIL PROTECTED]> writes:

> On Jun 4, 2008, at 12:44 PM, Ian Lance Taylor wrote:
>> Chris Lattner <[EMAIL PROTECTED]> writes:
>>
>>>> * The return value of lto_module_get_symbol_attributes is not
>>>> defined.
>>>
>>> Ah, sorry about that.  Most of the details are actually in the public
>>> header.  The result of this function is a 'lto_symbol_attributes'
>>> bitmask.  This should be more useful and revealing:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm-c/lto.h?revision=HEAD&view=markup
>>
>> From an ELF perspective, this doesn't seem to have a way to indicate a
>> common symbol, and it doesn't provide the symbol's type.
> The current lto interface does return whether a symbol is
> REGULAR, TENTATIVE, WEAK_DEF, or UNDEFINED.  There is also
> CODE vs DATA which could be used to indicate STT_FUNC vs STT_OBJECT.


By "type" I mean STT_FUNC or STT_OBJECT.  I took CODE vs. DATA to
refer to the section in which the symbol is defined (SHF_EXECINSTR
vs. SHF_WRITE).  But, you're right, with appropriate squinting CODE
vs. DATA is probably adequate.
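
A minimal sketch of that squinting, in C.  Every name prefixed sk_ or
SK_ is hypothetical, and the attribute bit layout is only an assumption
standing in for the lto_symbol_attributes bitmask; the real values are
in lto.h.  The ELF numbers (STT_*, STB_*, SHN_*) are the standard gABI
values, copied locally so the sketch compiles without <elf.h>.  It maps
CODE/DATA to a symbol type and TENTATIVE to a common symbol:

```c
#include <stdint.h>

/* Assumed stand-ins for the attribute bits; check lto.h for the real ones. */
enum {
    SK_PERM_MASK     = 0x0E0,
    SK_PERM_CODE     = 0x0A0,
    SK_PERM_DATA     = 0x0C0,
    SK_DEF_MASK      = 0x700,
    SK_DEF_REGULAR   = 0x100,
    SK_DEF_TENTATIVE = 0x200,
    SK_DEF_WEAK      = 0x300,
    SK_DEF_UNDEFINED = 0x400
};

/* Standard ELF gABI values, redefined locally. */
enum { SK_STT_NOTYPE = 0, SK_STT_OBJECT = 1, SK_STT_FUNC = 2 };
enum { SK_STB_GLOBAL = 1, SK_STB_WEAK = 2 };
/* SK_SHN_DEFINED is a placeholder for "some real section index". */
enum { SK_SHN_UNDEF = 0, SK_SHN_DEFINED = 1, SK_SHN_COMMON = 0xfff2 };

struct sk_elf_view {
    unsigned char type;    /* STT_* */
    unsigned char binding; /* STB_* */
    uint16_t shndx;        /* SHN_* or a section index */
};

/* Translate an attribute bitmask into the ELF view a linker would want. */
struct sk_elf_view sk_to_elf(uint32_t attrs)
{
    struct sk_elf_view v = { SK_STT_NOTYPE, SK_STB_GLOBAL, SK_SHN_DEFINED };

    switch (attrs & SK_PERM_MASK) {
    case SK_PERM_CODE: v.type = SK_STT_FUNC;   break;
    case SK_PERM_DATA: v.type = SK_STT_OBJECT; break;
    default:           break; /* anything else stays STT_NOTYPE */
    }

    switch (attrs & SK_DEF_MASK) {
    case SK_DEF_TENTATIVE: v.shndx = SK_SHN_COMMON;  break; /* common symbol */
    case SK_DEF_WEAK:      v.binding = SK_STB_WEAK;  break;
    case SK_DEF_UNDEFINED: v.shndx = SK_SHN_UNDEF;   break;
    default:               break; /* REGULAR: defined, global */
    }
    return v;
}
```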


> I see you have your gold hat on here!  The current interface is
> simple and clean.  If it does turn out that repeated calls to
> lto_module_get_symbol* are really a bottleneck, we could add a
> "bulk" function.

I would like to add the bulk function now, because I know that we will
want it.
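
To make the idea concrete, here is a rough sketch of what a bulk call
could look like next to per-symbol accessors.  Nothing here is part of
the existing LTO interface: sk_module, sk_symbol, and
sk_get_symbols_bulk are all hypothetical names, and the module is a toy
in-memory table rather than a real LTO module.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol record: name plus attribute bitmask. */
struct sk_symbol {
    const char *name;
    uint32_t attrs;
};

/* Toy stand-in for a loaded LTO module. */
struct sk_module {
    const struct sk_symbol *syms;
    size_t nsyms;
};

/* Per-symbol accessors, in the style of one call per name or attribute. */
size_t sk_num_symbols(const struct sk_module *m) { return m->nsyms; }

const char *sk_symbol_name(const struct sk_module *m, size_t i)
{
    return i < m->nsyms ? m->syms[i].name : NULL;
}

uint32_t sk_symbol_attrs(const struct sk_module *m, size_t i)
{
    return i < m->nsyms ? m->syms[i].attrs : 0;
}

/*
 * Hypothetical bulk call: copy up to `cap` records into `out` in a single
 * call and return the total number of symbols in the module, so a caller
 * that passed too small a buffer can resize and retry.
 */
size_t sk_get_symbols_bulk(const struct sk_module *m,
                           struct sk_symbol *out, size_t cap)
{
    size_t n = m->nsyms < cap ? m->nsyms : cap;
    for (size_t i = 0; i < n; i++)
        out[i] = m->syms[i];
    return m->nsyms;
}
```

With the accessors, the linker makes two calls per symbol; with the bulk
call it makes one call per module and walks the array itself.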


>>>> The LLVM
>>>> interface does not do that.
>>>
>>> Yes it does, the linker fully handles symbol resolution in our model.
>>>
>>>> Suppose the linker is invoked on a
>>>> sequence of object files, some with LTO information, some
>>>> without, all interspersed.  Suppose some symbols are defined in
>>>> multiple .o files, through the use of common symbols, weak symbols,
>>>> and/or section groups.  The LLVM interface simply passes each object
>>>> file to the plugin.
>>>
>>> No, the native linker handles all the native .o files.
>>>
>>>> The result is that the plugin is required to do
>>>> symbol resolution itself.  This 1) loses one of the benefits of
>>>> having the linker around; 2) will yield incorrect results when some
>>>> non-LTO object is linked in between LTO objects but redefines some
>>>> earlier weak symbol.
>>>
>>> In the LLVM LTO model, the plugin only needs to know about its .o
>>> files, and the linker uses this information to reason about symbol
>>> merging etc.  The Mac OS X linker can even do dead code stripping
>>> across Mach-O .o files and LLVM .bc files.
>>
>> To be clear, when I said object file here, I meant any input file.
>> You may have understood that.
>>
>> In ELF you have to think about symbol overriding.  Let's say you link
>> a.o b.o c.o.  a.o has a reference to symbol S.  b.o has a strong
>> definition.  c.o has a weak definition.  a.o and c.o have LTO
>> information, b.o does not.  ELF requires that a.o call the symbol from
>> b.o, not the symbol from c.o.  I don't see how to make that work with
>> the LLVM interface.
> This does work.  There are two parts to it.  First the linker's master
> symbol table sees the strong definition of S in b.o and the weak in c.o
> and decides to use the strong one from b.o.  Second (because of that)
> the linker calls lto_codegen_add_must_preserve_symbol("S").  The LTO
> engine then sees it has a weak global function S and it cannot inline
> those.  Put together the LTO engine does generate a copy of S, but the
> linker throws it away and uses the one from b.o.

OK, for that case.  But are you asserting that this works in all
cases?  Should I come up with other examples of mixing LTO objects
with non-LTO objects using different types of symbols?
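
For reference, the a.o b.o c.o case quoted above can be modelled in a
few lines of C.  This is a deliberately simplified sketch of the two
steps as described, not the algorithm of any real linker: it tracks a
single symbol, ignores duplicate strong definitions, common symbols,
and section groups, and all sk_ names are hypothetical.

```c
#include <stddef.h>

/* How one input file relates to the symbol S. */
enum sk_def { SK_UNDEF, SK_WEAK, SK_STRONG };

struct sk_input {
    const char *file;
    int is_lto;       /* nonzero if the file carries LTO information */
    enum sk_def def;
};

/*
 * Step 1: pick the winning definition in command-line order.  The first
 * definition seen wins unless a later strong one overrides a weak one.
 * Returns the index of the winning input, or -1 if S is never defined.
 */
int sk_resolve(const struct sk_input *in, size_t n)
{
    int winner = -1;
    for (size_t i = 0; i < n; i++) {
        if (in[i].def == SK_UNDEF)
            continue;
        if (winner < 0 ||
            (in[i].def == SK_STRONG && in[winner].def == SK_WEAK))
            winner = (int)i;
    }
    return winner;
}

/*
 * Step 2: decide whether the linker should tell the LTO engine to
 * preserve S.  In this model that happens when a non-LTO file won the
 * resolution while some LTO file also defines S, so the LTO copy must
 * stay an out-of-line symbol that the linker can discard.
 */
int sk_must_preserve(const struct sk_input *in, size_t n)
{
    int w = sk_resolve(in, n);
    if (w < 0 || in[w].is_lto)
        return 0;
    for (size_t i = 0; i < n; i++)
        if (in[i].is_lto && in[i].def != SK_UNDEF)
            return 1;
    return 0;
}
```

Feeding it a.o (LTO, undefined), b.o (non-LTO, strong), c.o (LTO, weak)
selects b.o and marks S as must-preserve.  The open question is which
other mixes of symbol kinds fall outside a model this simple.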

Ian
