Nick Kledzik <[EMAIL PROTECTED]> writes:
> On Jun 4, 2008, at 12:44 PM, Ian Lance Taylor wrote:
>> Chris Lattner <[EMAIL PROTECTED]> writes:
>>
>>>> * The return value of lto_module_get_symbol_attributes is not
>>>> defined.
>>>
>>> Ah, sorry about that. Most of the details are actually in the public
>>> header. The result of this function is a 'lto_symbol_attributes'
>>> bitmask. This should be more useful and revealing:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm-c/lto.h?revision=HEAD&view=markup
>>
>> From an ELF perspective, this doesn't seem to have a way to indicate a
>> common symbol, and it doesn't provide the symbol's type.
> The current lto interface does return whether a symbol is
> REGULAR, TENTATIVE, WEAK_DEF, or UNDEFINED. There is also
> CODE vs DATA which could be used to indicate STT_FUNC vs STT_OBJECT.
By "type" I mean STT_FUNC or STT_OBJECT. I took CODE vs. DATA to
refer to the section in which the symbol is defined (SHF_EXECINSTR
vs. SHF_WRITE). But, you're right, with appropriate squinting CODE
vs. DATA is probably adequate.
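A rough sketch of that squinting, for concreteness. The EX_* attribute masks and values below are stand-ins defined locally so the snippet compiles on its own; the real names and values live in lto.h and may differ. The STT_* and STB_* numbers are the standard ELF ones. Treating TENTATIVE as "common" is an assumption about how a common symbol would be expressed through this interface.

```c
#include <stdint.h>

/* Stand-in attribute bits (hypothetical values; see lto.h for the real ones). */
enum {
    EX_PERM_MASK     = 0x0e0,
    EX_PERM_CODE     = 0x0a0,
    EX_PERM_DATA     = 0x0c0,
    EX_DEF_MASK      = 0x700,
    EX_DEF_REGULAR   = 0x100,
    EX_DEF_TENTATIVE = 0x200,
    EX_DEF_WEAK      = 0x300,
    EX_DEF_UNDEFINED = 0x400
};

/* Standard ELF values, repeated here to avoid depending on <elf.h>. */
enum { EX_STT_NOTYPE = 0, EX_STT_OBJECT = 1, EX_STT_FUNC = 2 };
enum { EX_STB_GLOBAL = 1, EX_STB_WEAK = 2 };

/* What an ELF linker would want to know about one symbol. */
typedef struct {
    unsigned char type;     /* STT_* */
    unsigned char binding;  /* STB_* */
    int is_common;          /* would live in SHN_COMMON */
    int is_undefined;       /* would live in SHN_UNDEF */
} ex_elf_view;

/* Translate an attribute bitmask into the ELF view above. */
static ex_elf_view ex_to_elf(uint32_t attrs)
{
    ex_elf_view v = { EX_STT_NOTYPE, EX_STB_GLOBAL, 0, 0 };

    /* CODE vs. DATA stands in for the symbol type. */
    switch (attrs & EX_PERM_MASK) {
    case EX_PERM_CODE: v.type = EX_STT_FUNC;   break;
    case EX_PERM_DATA: v.type = EX_STT_OBJECT; break;
    default:           break;
    }

    /* The definition kind supplies binding and common/undefined status. */
    switch (attrs & EX_DEF_MASK) {
    case EX_DEF_TENTATIVE: v.is_common = 1;          break;
    case EX_DEF_WEAK:      v.binding = EX_STB_WEAK;  break;
    case EX_DEF_UNDEFINED: v.is_undefined = 1;       break;
    default:               break;
    }
    return v;
}
```

Anything that is neither CODE nor DATA falls back to STT_NOTYPE here, which is where the squinting stops being adequate.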
> I see you have your gold hat on here! The current interface is
> simple and clean. If it does turn out that repeated calls to
> lto_module_get_symbol* are really a bottleneck, we could add a "bulk"
> function.
I would like to add the bulk function now, because I know that we will
want it.
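One possible shape for such a bulk call, as a sketch only. Everything named ex_* here is hypothetical and does not exist in lto.h: ex_module is a stand-in for a loaded module, the three per-symbol accessors imitate the one-call-per-symbol style of lto_module_get_symbol*, and ex_module_get_symbols is the imagined bulk entry point.

```c
#include <stddef.h>

/* Stand-in for a loaded LTO module (hypothetical). */
typedef struct {
    const char *const *names;
    const unsigned *attrs;
    size_t count;
} ex_module;

/* One record per symbol, as a bulk call might return it (hypothetical). */
typedef struct {
    const char *name;
    unsigned attributes;
} ex_symbol_info;

/* Per-symbol accessors in the style of the existing interface. */
static size_t ex_module_num_symbols(const ex_module *m)
{
    return m->count;
}

static const char *ex_module_symbol_name(const ex_module *m, size_t i)
{
    return m->names[i];
}

static unsigned ex_module_symbol_attributes(const ex_module *m, size_t i)
{
    return m->attrs[i];
}

/* Hypothetical bulk call: writes at most `cap` records into `out` and
   returns the total number of symbols in the module, so a caller can
   detect a short buffer and retry with a larger one. */
static size_t ex_module_get_symbols(const ex_module *m,
                                    ex_symbol_info *out, size_t cap)
{
    size_t total = ex_module_num_symbols(m);
    size_t limit = total < cap ? total : cap;

    for (size_t i = 0; i < limit; i++) {
        out[i].name = ex_module_symbol_name(m, i);
        out[i].attributes = ex_module_symbol_attributes(m, i);
    }
    return total;
}
```

The linker would make one call per module instead of two calls per symbol. Returning the total rather than the number written is a design choice of this sketch, not something either interface specifies.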
>>>> The LLVM
>>>> interface does not do that.
>>>
>>> Yes it does, the linker fully handles symbol resolution in our model.
>>>
>>>> Suppose the linker is invoked on a
>>>> sequence of object files, some with LTO information, some
>>>> without, all interspersed. Suppose some symbols are defined in
>>>> multiple .o files, through the use of common symbols, weak symbols,
>>>> and/or section groups. The LLVM interface simply passes each object
>>>> file to the plugin.
>>>
>>> No, the native linker handles all the native .o files.
>>>
>>>> The result is that the plugin is required to do
>>>> symbol resolution itself. This 1) loses one of the benefits of
>>>> having the linker around; 2) will yield incorrect results when some
>>>> non-LTO object is linked in between LTO objects but redefines some
>>>> earlier weak symbol.
>>>
>>> In the LLVM LTO model, the plugin only needs to know about its .o
>>> files, and the linker uses this information to reason about symbol
>>> merging etc. The Mac OS X linker can even do dead code stripping
>>> across Mach-O .o files and LLVM .bc files.
>>
>> To be clear, when I said object file here, I meant any input file.
>> You may have understood that.
>>
>> In ELF you have to think about symbol overriding. Let's say you link
>> a.o b.o c.o. a.o has a reference to symbol S. b.o has a strong
>> definition. c.o has a weak definition. a.o and c.o have LTO
>> information, b.o does not. ELF requires that a.o call the symbol from
>> b.o, not the symbol from c.o. I don't see how to make that work with
>> the LLVM interface.
> This does work. There are two parts to it. First the linker's master
> symbol table sees the strong definition of S in b.o and the weak in
> c.o and decides to use the strong one from b.o. Second (because of
> that) the linker calls lto_codegen_add_must_preserve_symbol("S"). The
> LTO engine then sees it has a weak global function S and it cannot
> inline those. Put together the LTO engine does generate a copy of S,
> but the linker throws it away and uses the one from b.o.
OK, for that case. But are you asserting that this works in all
cases? Should I come up with other examples of mixing LTO objects
with non-LTO objects using different types of symbols?
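To pin down the a.o b.o c.o case, a minimal sketch of the two steps as described above. All ex_* names are hypothetical; this models only the decision logic, not either linker. The rule used for the second step (tell the LTO engine to preserve S whenever the winning definition is outside the LTO inputs but some LTO input also defines S) is one reading of the description, stated here as an assumption.

```c
#include <stddef.h>

/* How one input file provides symbol S. */
typedef enum { EX_UNDEFINED, EX_WEAK_DEF, EX_STRONG_DEF } ex_strength;

typedef struct {
    const char *file;
    ex_strength strength;
    int has_lto;            /* nonzero if the file carries LTO information */
} ex_input;

/* Step one: pick the definition of S in command-line order.
   A strong definition overrides a weak one; among weak definitions the
   first wins. Returns the index of the winning input, -1 if S is never
   defined, or -2 if there are two strong definitions. */
static int ex_resolve(const ex_input *in, size_t n)
{
    int winner = -1;

    for (size_t i = 0; i < n; i++) {
        if (in[i].strength == EX_UNDEFINED)
            continue;
        if (winner < 0) {
            winner = (int)i;
        } else if (in[i].strength == EX_STRONG_DEF) {
            if (in[winner].strength == EX_STRONG_DEF)
                return -2;  /* multiple definition error */
            winner = (int)i;
        }
    }
    return winner;
}

/* Step two (assumed rule): S must be preserved by the LTO engine when
   the winner is a non-LTO file and some LTO file also defines S, so the
   engine emits its copy out of line and the linker can discard it. */
static int ex_must_preserve(const ex_input *in, size_t n, int winner)
{
    if (winner < 0 || in[winner].has_lto)
        return 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i].has_lto && in[i].strength != EX_UNDEFINED)
            return 1;
    }
    return 0;
}
```

With a.o (undefined, LTO), b.o (strong, no LTO) and c.o (weak, LTO), ex_resolve picks b.o and ex_must_preserve reports that S has to be kept, which matches the case above. It says nothing about common symbols or section groups.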
Ian