Hi Jakub.
> On Thu, Dec 08, 2022 at 02:02:36PM +0100, Jose E. Marchesi wrote: >> So, I guess the right fix would be to call assemble_external_libcall >> during final? The `.global FOO' directive would be generated >> immediately before the call sequence, but I guess that would be ok. > > During final only if all the targets can deal with the effects of > assemble_external_libcall being done in the middle of emitting assembly > for the function. > > Otherwise, it could be e.g. done in the first loop of shorten_branches. > > Note, in calls.cc it is done only for emit_library_call_value_1 > and not for emit_call_1, so if we do it late, we need to be able to find > out what call is to a libcall and what is to a normal call. If there is > no way to differentiate it right now, perhaps we need some flag somewhere, > say on a SYMBOL_REF. And then assemble_external_libcall either only > if such a SYMBOL_REF appears in CALL_INSN or sibcall JUMP_INSN, or > perhaps anywhere in the function and its constant pool. Allright, the quick-and-dirty patch below seems to DTRT with simple examples. First, when libcalls are generated. Note only one .global is generated for all calls, and actually it is around the same position than before: $ cat foo.c int foo(unsigned int len, int flag) { if (flag) return (((long)len) * 234 / 5); return (((long)len) * 2 / 5); } $ cc1 -O2 foo.c $ cat foo.c .file "foo.c" .text .global __divdi3 .align 3 .global foo .type foo, @function foo: mov32 %r1,%r1 lsh %r2,32 jne %r2,0,.L5 mov %r2,5 lsh %r1,1 call __divdi3 lsh %r0,32 arsh %r0,32 exit .L5: mov %r2,5 mul %r1,234 call __divdi3 lsh %r0,32 arsh %r0,32 exit .size foo, .-foo .ident "GCC: (GNU) 13.0.0 20221207 (experimental)" Second, when libcalls are tried by expand_moddiv in a sequence, but then discarded and not linked in the main sequence: $ cat foo.c int foo(unsigned int len, int flag) { if (flag) return (((long)len) * 234 / 5); return (((long)len) * 2 / 5); } $ cc1 -O2 foo.c $ cat foo.c .file "foo.c" .text .align 3 .global foo .type foo, @function foo: mov32 %r0,%r1 lsh %r2,32 jne %r2,0,.L5 add %r0,%r0 div %r0,5 lsh %r0,32 arsh %r0,32 exit .L5: mul %r0,234 div %r0,5 lsh %r0,32 arsh %r0,32 exit .size foo, .-foo .ident "GCC: (GNU) 13.0.0 20221207 (experimental)" Note the .global now is not generated, as desired. As you can see below, I am adding a new RTX flag `is_libcall', with written form "/l". Before I get into serious testing etc, can you please confirm whether this is the right approach or not? In particular, I am a little bit concerned about the expectation I am using that the target of the `call' instruction emitted by emit_call_1 is always a (MEM (SYMBOL_REF ...)) when it is passed a SYMBOL_REF as the first argument (`fun' in emit_library_call_value_1). Thanks. diff --git a/gcc/calls.cc b/gcc/calls.cc index 6dd6f73e978..6c4a3725272 100644 --- a/gcc/calls.cc +++ b/gcc/calls.cc @@ -4370,10 +4370,6 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value, || argvec[i].partial != 0) update_stack_alignment_for_call (&argvec[i].locate); - /* If this machine requires an external definition for library - functions, write one out. */ - assemble_external_libcall (fun); - original_args_size = args_size; args_size.constant = (aligned_upper_bound (args_size.constant + stack_pointer_delta, @@ -4717,6 +4713,9 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value, valreg, old_inhibit_defer_pop + 1, call_fusage, flags, args_so_far); + /* Mark the emitted call as a libcall with the new flag. */ + RTL_LIBCALL_P (last_call_insn ()) = 1; + if (flag_ipa_ra) { rtx datum = orgfun; diff --git a/gcc/final.cc b/gcc/final.cc index eea572238f6..df57de5afd0 100644 --- a/gcc/final.cc +++ b/gcc/final.cc @@ -815,6 +815,8 @@ make_pass_compute_alignments (gcc::context *ctxt) reorg.cc, since the branch splitting exposes new instructions with delay slots. */ +static rtx call_from_call_insn (rtx_call_insn *insn); + void shorten_branches (rtx_insn *first) { @@ -850,6 +852,24 @@ shorten_branches (rtx_insn *first) for (insn = get_insns (), i = 1; insn; insn = NEXT_INSN (insn)) { INSN_SHUID (insn) = i++; + + /* If this is a `call' instruction that implements a libcall, + and this machine requires an external definition for library + functions, write one out. */ + if (CALL_P (insn) && RTL_LIBCALL_P (insn)) + { + rtx nested_call = call_from_call_insn (safe_as_a <rtx_call_insn*> (insn)); + rtx mem = XEXP (nested_call, 0); + gcc_assert (GET_CODE (mem) == MEM); + rtx fun = XEXP (mem, 0); + gcc_assert (GET_CODE (fun) == SYMBOL_REF); + assemble_external_libcall (fun); + + /* Clear the LIBCALL flag to make sure we don't assemble the + external definition more than once. */ + RTL_LIBCALL_P (insn) = 0; + } + if (INSN_P (insn)) continue; diff --git a/gcc/print-rtl.cc b/gcc/print-rtl.cc index e115f987173..26a06511619 100644 --- a/gcc/print-rtl.cc +++ b/gcc/print-rtl.cc @@ -882,6 +882,9 @@ rtx_writer::print_rtx (const_rtx in_rtx) if (RTX_FLAG (in_rtx, return_val)) fputs ("/i", m_outfile); + if (RTX_FLAG (in_rtx, is_libcall)) + fputs ("/l", m_outfile); + /* Print REG_NOTE names for EXPR_LIST and INSN_LIST. */ if ((GET_CODE (in_rtx) == EXPR_LIST || GET_CODE (in_rtx) == INSN_LIST diff --git a/gcc/read-rtl.cc b/gcc/read-rtl.cc index 62c7895af60..eb0ae150a5b 100644 --- a/gcc/read-rtl.cc +++ b/gcc/read-rtl.cc @@ -1543,6 +1543,9 @@ read_flags (rtx return_rtx) case 'i': RTX_FLAG (return_rtx, return_val) = 1; break; + case 'l': + RTX_FLAG (return_rtx, is_libcall) = 1; + break; default: fatal_with_file_and_line ("unrecognized flag: `%c'", flag_char); } diff --git a/gcc/rtl.h b/gcc/rtl.h index 7a8c4709257..92c802d3876 100644 --- a/gcc/rtl.h +++ b/gcc/rtl.h @@ -401,6 +401,10 @@ struct GTY((desc("0"), tag("0"), Dumped as "/i" in RTL dumps. */ unsigned return_val : 1; + /* 1 in a CALL_INSN if it is a libcall. + Dumped as "/l" in RTL dumps. */ + unsigned is_libcall : 1; + union { /* The final union field is aligned to 64 bits on LP64 hosts, giving a 32-bit gap after the fields above. We optimize the @@ -1578,6 +1582,10 @@ jump_table_for_label (const rtx_code_label *label) #define RTL_PURE_CALL_P(RTX) \ (RTL_FLAG_CHECK1 ("RTL_PURE_CALL_P", (RTX), CALL_INSN)->return_val) +/* 1 if RTX is a libcall. */ +#define RTL_LIBCALL_P(RTX) \ + (RTL_FLAG_CHECK1 ("RTL_LIBCALL_P", (RTX), CALL_INSN)->is_libcall) + /* 1 if RTX is a call to a const or pure function. */ #define RTL_CONST_OR_PURE_CALL_P(RTX) \ (RTL_CONST_CALL_P (RTX) || RTL_PURE_CALL_P (RTX))