"Jose E. Marchesi" <jose.march...@oracle.com> writes: >> "Jose E. Marchesi" <jose.march...@oracle.com> writes: >>> Hi Richard. >>> Thanks for looking at this! :) >>> >>> >>>> "Jose E. Marchesi" <jose.march...@oracle.com> writes: >>>>> ping >>>> >>>> I don't know this code very well, and have AFAIR haven't worked >>>> with an assembler that requires external declarations, but since >>>> it's at a second ping :) >>>> >>>>> >>>>>> ping >>>>>> >>>>>>> [Differences from V1: >>>>>>> - Prototype for call_from_call_insn moved before comment block. >>>>>>> - Reuse the `call' flag for SYMBOL_REF_LIBCALL. >>>>>>> - Fallback to check REG_CALL_DECL in non-direct calls. >>>>>>> - New test to check correct behavior for non-direct calls.] >>>>>>> >>>>>>> There are many places in GCC where alternative local sequences are >>>>>>> tried in order to determine what is the cheapest or best alternative >>>>>>> to use in the current target. When any of these sequences involve a >>>>>>> libcall, the current implementation of emit_library_call_value_1 >>>>>>> introduce a side-effect consisting on emitting an external declaration >>>>>>> for the funcall (such as __divdi3) which is thus emitted even if the >>>>>>> sequence that does the libcall is not retained. >>>>>>> >>>>>>> This is problematic in targets such as BPF, because the kernel loader >>>>>>> chokes on the spurious symbol __divdi3 and makes the resulting BPF >>>>>>> object unloadable. Note that BPF objects are not linked before being >>>>>>> loaded. >>>>>>> >>>>>>> This patch changes emit_library_call_value_1 to mark the target >>>>>>> SYMBOL_REF as a libcall. Then, the emission of the external >>>>>>> declaration is done in the first loop of final.cc:shorten_branches. >>>>>>> This happens only if the corresponding sequence has been kept. >>>>>>> >>>>>>> Regtested in x86_64-linux-gnu. >>>>>>> Tested with host x86_64-linux-gnu with target bpf-unknown-none. >>>> >>>> I'm not sure that shorten_branches is a natural place to do this. >>>> It isn't something that would normally emit asm text. >>> >>> Well, that was the approach suggested by another reviewer (Jakub) once >>> my initial approach (in the V1) got rejected. He explicitly suggested >>> to use shorten_branches. >>> >>>> Would it be OK to emit the declaration at the same point as for decls, >>>> which IIUC is process_pending_assemble_externals? If so, how about >>>> making assemble_external_libcall add the symbol to a list when >>>> !SYMBOL_REF_USED, instead of calling targetm.asm_out.external_libcall >>>> directly? assemble_external_libcall could then also call get_identifier >>>> on the name (perhaps after calling strip_name_encoding -- can't >>>> remember whether assemble_external_libcall sees the encoded or >>>> unencoded name). >>>> >>>> All being well, the call to get_identifier should cause >>>> assemble_name_resolve to record when the name is used, via >>>> TREE_SYMBOL_REFERENCED. Then process_pending_assemble_externals could >>>> go through the list of libcalls recorded by assemble_external_libcall >>>> and check whether TREE_SYMBOL_REFERENCED is set on the get_identifier. >>>> >>>> Not super elegant, but it seems to fit within the existing scheme. >>>> And I don't there should be any problem with using get_identifier >>>> for libcalls, since it isn't valid to use libcall names for other >>>> types of symbol. >>> >>> This sounds way more complicated to me than the approach in V2, which >>> seems to work and is thus a clear improvement compared to the current >>> situation in the trunk. The approach in V2 may be ugly, but it is >>> simple and easy to understand. Is the proposed more convoluted >>> alternative really worth the extra complexity, given it is "not super >>> elegant"? >> >> Is it really that much more convoluted? I was thinking of something >> like the attached, which seems a bit shorter than V2, and does seem >> to fix the bpf tests. > > o_O > Ok I clearly misunderstood what you was proposing. This is way simpler! > > How does the magic of TREE_SYMBOL_REFERENCED work? How is it set to > `true' only if the RTL containing the call is retained in the final > chain?
It happens in assemble_name, via assemble_name_resolve. The system relies on code using that rather than assemble_name_raw for symbols that might need to be declared, or that might need visibility information attached. (It relies on that in general, I mean, not just for this patch.) Thanks, Richard