On Thu, Feb 13, 2025 at 5:31 PM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Thu, Feb 13, 2025 at 1:58 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > x86 conditional branch (jcc) target can be either a label or a symbol.
> > Add a pass to fold tail call with jcc by turning:
> >
> >         jcc     .L6
> >         ...
> > .L6:
> >         jmp     tailcall
> >
> > into:
> >
> >         jcc     tailcall
> >
> > After basic block reordering pass, conditional branches look like
> >
> > (jump_insn 7 6 14 2 (set (pc)
> >         (if_then_else (eq (reg:CCZ 17 flags)
> >                 (const_int 0 [0]))
> >             (label_ref:DI 23)
> >             (pc))) "x.c":8:5 1458 {jcc}
> >      (expr_list:REG_DEAD (reg:CCZ 17 flags)
> >         (int_list:REG_BR_PROB 217325348 (nil)))
> > ...
> > (code_label 23 20 8 4 4 (nil) [1 uses])
> > (note 8 23 9 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
> > (call_insn/j 9 8 10 4 (call (mem:QI (symbol_ref:DI ("bar") [flags 0x41]
> >         <function_decl 0x7f4cff3c0b00 bar>) [0 bar S1 A8])
> >         (const_int 0 [0])) "x.c":8:14 discrim 1 1469 {sibcall_di}
> >      (expr_list:REG_CALL_DECL (symbol_ref:DI ("bar") [flags 0x41]
> >         <function_decl 0x7f4cff3c0b00 bar>)
> >         (nil))
> >     (nil))
> >
> > If the branch edge destination is a basic block with only a direct
> > sibcall, change the jcc target to the sibcall target and decrement
> > the destination basic block entry label use count.  Even though the
> > destination basic block is unused, it must be kept since it is required
> > by RTL control flow check and JUMP_LABEL of the conditional jump can
> > only point to a code label, not a code symbol.  Dummy sibcall patterns
> > are added so that sibcalls in basic blocks, whose entry label use count
> > is 0, won't be generated.
>
> This reads like you are trying to get around some checks in RTL
> control flow. So, either changes you are performing to RTX stream are
> not allowed (these checks are here for a reason), or the
> infrastructure is not (yet) prepared to handle this functionality.

The main issue is that JUMP_LABEL of the conditional jump can only point
to a code label, not a code symbol, so I have no choice but to keep it
even if it is unused.  If the infrastructure allowed a symbol reference
in all places where a label reference is allowed, only x86 backend
changes would be needed.

BTW, some targets, like arm, don't set the use count on referenced labels.
I will add a target hook to opt out of the zero use count label handling.

> Either way, please discuss with infrastructure maintainers (CC'd)
> first if the approach is correct and if these changes to RTX stream
> are allowed by the infra.
>
> Thanks,
> Uros.
>
> >
> > Jump tables like
> >
> > foo:
> >         .cfi_startproc
> >         cmpl    $4, %edi
> >         ja      .L1
> >         movl    %edi, %edi
> >         jmp     *.L4(,%rdi,8)
> >         .section        .rodata
> > .L4:
> >         .quad   .L8
> >         .quad   .L7
> >         .quad   .L6
> >         .quad   .L5
> >         .quad   .L3
> >         .text
> > .L5:
> >         jmp     bar3
> > .L3:
> >         jmp     bar4
> > .L8:
> >         jmp     bar0
> > .L7:
> >         jmp     bar1
> > .L6:
> >         jmp     bar2
> > .L1:
> >         ret
> >         .cfi_endproc
> >
> > can also be changed to:
> >
> > foo:
> >         .cfi_startproc
> >         cmpl    $4, %edi
> >         ja      .L1
> >         movl    %edi, %edi
> >         jmp     *.L4(,%rdi,8)
> >         .section        .rodata
> > .L4:
> >         .quad   bar0
> >         .quad   bar1
> >         .quad   bar2
> >         .quad   bar3
> >         .quad   bar4
> >         .text
> > .L1:
> >         ret
> >         .cfi_endproc
> >
> > After basic block reordering pass, jump tables look like:
> >
> > (jump_table_data 16 15 17 (addr_vec:DI [
> >             (label_ref:DI 18)
> >             (label_ref:DI 22)
> >             (label_ref:DI 26)
> >             (label_ref:DI 30)
> >             (label_ref:DI 34)
> >         ]))
> > ...
> > (code_label 30 17 31 4 5 (nil) [1 uses])
> > (note 31 30 32 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
> > (call_insn/j 32 31 33 4 (call (mem:QI (symbol_ref:DI ("bar3") [flags 0x41]
> >         <function_decl 0x7f21be3c0e00 bar3>) [0 bar3 S1 A8])
> >         (const_int 0 [0])) "j.c":15:13 1469 {sibcall_di}
> >      (expr_list:REG_CALL_DECL (symbol_ref:DI ("bar3") [flags 0x41]
> >         <function_decl 0x7f21be3c0e00 bar3>)
> >         (nil))
> >     (nil))
> >
> > If the jump table entry points to a target basic block with only a direct
> > sibcall, change the entry to point to the sibcall target and decrement
> > the target basic block entry label use count.  If the target basic block
> > isn't kept for JUMP_LABEL of the conditional tailcall, delete it if its
> > entry label use count is 0.
> >
> > Update final_scan_insn_1 to skip a label if its use count is 0 and
> > support symbol reference in jump table.  Update create_trace_edges to
> > skip symbol reference in jump table.
> >
> > H.J. Lu (2):
> >   x86: Add a pass to fold tail call
> >   x86: Fold sibcall targets into jump table
> >
> >  gcc/config/i386/i386-features.cc           | 274 +++++++++++++++++++++
> >  gcc/config/i386/i386-passes.def            |   1 +
> >  gcc/config/i386/i386-protos.h              |   3 +
> >  gcc/config/i386/i386.cc                    |  12 +
> >  gcc/config/i386/i386.md                    |  57 ++++-
> >  gcc/config/i386/predicates.md              |   4 +
> >  gcc/dwarf2cfi.cc                           |   7 +-
> >  gcc/final.cc                               |  26 +-
> >  gcc/testsuite/gcc.target/i386/pr14721-1a.c |  54 ++++
> >  gcc/testsuite/gcc.target/i386/pr14721-1b.c |  37 +++
> >  gcc/testsuite/gcc.target/i386/pr14721-1c.c |  37 +++
> >  gcc/testsuite/gcc.target/i386/pr14721-2a.c |  58 +++++
> >  gcc/testsuite/gcc.target/i386/pr14721-2b.c |  41 +++
> >  gcc/testsuite/gcc.target/i386/pr14721-2c.c |  43 ++++
> >  gcc/testsuite/gcc.target/i386/pr14721-3a.c |  56 +++++
> >  gcc/testsuite/gcc.target/i386/pr14721-3b.c |  40 +++
> >  gcc/testsuite/gcc.target/i386/pr14721-3c.c |  39 +++
> >  gcc/testsuite/gcc.target/i386/pr47253-1a.c |  24 ++
> >  gcc/testsuite/gcc.target/i386/pr47253-1b.c |  17 ++
> >  gcc/testsuite/gcc.target/i386/pr47253-2a.c |  27 ++
> >  gcc/testsuite/gcc.target/i386/pr47253-2b.c |  17 ++
> >  gcc/testsuite/gcc.target/i386/pr47253-3a.c |  32 +++
> >  gcc/testsuite/gcc.target/i386/pr47253-3b.c |  20 ++
> >  gcc/testsuite/gcc.target/i386/pr47253-3c.c |  20 ++
> >  gcc/testsuite/gcc.target/i386/pr47253-4a.c |  26 ++
> >  gcc/testsuite/gcc.target/i386/pr47253-4b.c |  18 ++
> >  gcc/testsuite/gcc.target/i386/pr47253-5.c  |  15 ++
> >  gcc/testsuite/gcc.target/i386/pr47253-6.c  |  15 ++
> >  gcc/testsuite/gcc.target/i386/pr47253-7a.c |  52 ++++
> >  gcc/testsuite/gcc.target/i386/pr47253-7b.c |  36 +++
> >  30 files changed, 1097 insertions(+), 11 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-1a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-1b.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-1c.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-2a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-2b.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-2c.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-3a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-3b.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr14721-3c.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-1a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-1b.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-2a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-2b.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-3a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-3b.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-3c.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-4a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-4b.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-5.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-6.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-7a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr47253-7b.c
> >
> > --
> > 2.48.1
> >

--
H.J.
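
For readers following the use-count bookkeeping in the jump-table fold
quoted above, here is a toy model in C.  It is only a sketch under stated
assumptions: struct toy_block, struct toy_entry and fold_jump_table are
hypothetical names invented for illustration, not GCC data structures or
functions, and none of the real RTL machinery (JUMP_LABEL, CFG checks,
final_scan_insn_1) is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a basic block (hypothetical, not GCC's basic_block). */
struct toy_block {
    const char *label;          /* entry label, e.g. ".L5" */
    const char *sibcall_target; /* symbol if the block is only a direct
                                   sibcall, otherwise NULL */
    int label_uses;             /* entry label use count */
    int kept_for_jump_label;    /* nonzero if a folded jcc still needs the
                                   block as its JUMP_LABEL */
    int deleted;                /* set once the block is dropped */
};

/* A jump table entry is a label reference until it is folded, after
   which it is a symbol reference. */
struct toy_entry {
    struct toy_block *block;    /* non-NULL while the entry is a label */
    const char *symbol;         /* non-NULL once the entry is folded */
};

/* Fold every entry whose target block is only a direct sibcall.
   Returns the number of entries folded. */
static int fold_jump_table(struct toy_entry *table, size_t n)
{
    int folded = 0;

    for (size_t i = 0; i < n; i++) {
        struct toy_block *bb = table[i].block;

        /* Skip entries already folded and blocks that do more than
           a single direct sibcall. */
        if (bb == NULL || bb->sibcall_target == NULL)
            continue;

        /* Point the entry at the sibcall target instead of the label. */
        table[i].symbol = bb->sibcall_target;
        table[i].block = NULL;

        /* Drop one use of the entry label; delete the block only when
           nothing references it and it is not kept for JUMP_LABEL. */
        assert(bb->label_uses > 0);
        bb->label_uses--;
        if (bb->label_uses == 0 && !bb->kept_for_jump_label)
            bb->deleted = 1;

        folded++;
    }
    return folded;
}
```

With a table holding .L5 (only "jmp bar3") and .L1 (a plain "ret"), the
model folds the first entry to bar3 and marks .L5 deleted, while the .L1
entry stays a label reference with its use count unchanged.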