"Jose E. Marchesi" <jose.march...@oracle.com> writes:
>> "Jose E. Marchesi" <jose.march...@oracle.com> writes:
>>> Hi Richard.
>>> Thanks for looking at this! :)
>>>
>>>
>>>> "Jose E. Marchesi" <jose.march...@oracle.com> writes:
>>>>> ping
>>>>
>>>> I don't know this code very well, and have AFAIR haven't worked
>>>> with an assembler that requires external declarations, but since
>>>> it's at a second ping :)
>>>>
>>>>>
>>>>>> ping
>>>>>>
>>>>>>> [Differences from V1:
>>>>>>> - Prototype for call_from_call_insn moved before comment block.
>>>>>>> - Reuse the `call' flag for SYMBOL_REF_LIBCALL.
>>>>>>> - Fallback to check REG_CALL_DECL in non-direct calls.
>>>>>>> - New test to check correct behavior for non-direct calls.]
>>>>>>>
>>>>>>> There are many places in GCC where alternative local sequences are
>>>>>>> tried in order to determine what is the cheapest or best alternative
>>>>>>> to use in the current target.  When any of these sequences involve a
>>>>>>> libcall, the current implementation of emit_library_call_value_1
>>>>>>> introduce a side-effect consisting on emitting an external declaration
>>>>>>> for the funcall (such as __divdi3) which is thus emitted even if the
>>>>>>> sequence that does the libcall is not retained.
>>>>>>>
>>>>>>> This is problematic in targets such as BPF, because the kernel loader
>>>>>>> chokes on the spurious symbol __divdi3 and makes the resulting BPF
>>>>>>> object unloadable.  Note that BPF objects are not linked before being
>>>>>>> loaded.
>>>>>>>
>>>>>>> This patch changes emit_library_call_value_1 to mark the target
>>>>>>> SYMBOL_REF as a libcall.  Then, the emission of the external
>>>>>>> declaration is done in the first loop of final.cc:shorten_branches.
>>>>>>> This happens only if the corresponding sequence has been kept.
>>>>>>>
>>>>>>> Regtested in x86_64-linux-gnu.
>>>>>>> Tested with host x86_64-linux-gnu with target bpf-unknown-none.
>>>>
>>>> I'm not sure that shorten_branches is a natural place to do this.
>>>> It isn't something that would normally emit asm text.
>>>
>>> Well, that was the approach suggested by another reviewer (Jakub) once
>>> my initial approach (in the V1) got rejected.  He explicitly suggested
>>> to use shorten_branches.
>>>
>>>> Would it be OK to emit the declaration at the same point as for decls,
>>>> which IIUC is process_pending_assemble_externals?  If so, how about
>>>> making assemble_external_libcall add the symbol to a list when
>>>> !SYMBOL_REF_USED, instead of calling targetm.asm_out.external_libcall
>>>> directly?  assemble_external_libcall could then also call get_identifier
>>>> on the name (perhaps after calling strip_name_encoding -- can't
>>>> remember whether assemble_external_libcall sees the encoded or
>>>> unencoded name).
>>>>
>>>> All being well, the call to get_identifier should cause
>>>> assemble_name_resolve to record when the name is used, via
>>>> TREE_SYMBOL_REFERENCED.  Then process_pending_assemble_externals could
>>>> go through the list of libcalls recorded by assemble_external_libcall
>>>> and check whether TREE_SYMBOL_REFERENCED is set on the get_identifier.
>>>>
>>>> Not super elegant, but it seems to fit within the existing scheme.
>>>> And I don't there should be any problem with using get_identifier
>>>> for libcalls, since it isn't valid to use libcall names for other
>>>> types of symbol.
>>>
>>> This sounds way more complicated to me than the approach in V2, which
>>> seems to work and is thus a clear improvement compared to the current
>>> situation in the trunk.  The approach in V2 may be ugly, but it is
>>> simple and easy to understand.  Is the proposed more convoluted
>>> alternative really worth the extra complexity, given it is "not super
>>> elegant"?
>>
>> Is it really that much more convoluted?  I was thinking of something
>> like the attached, which seems a bit shorter than V2, and does seem
>> to fix the bpf tests.
>
> o_O
> Ok I clearly misunderstood what you was proposing.  This is way simpler!
>
> How does the magic of TREE_SYMBOL_REFERENCED work?  How is it set to
> `true' only if the RTL containing the call is retained in the final
> chain?

It happens in assemble_name, via assemble_name_resolve.  The system
relies on code using that rather than assemble_name_raw for symbols
that might need to be declared, or that might need visibility
information attached.  (It relies on that in general, I mean,
not just for this patch.)

Thanks,
Richard

Reply via email to