>> Am 08.12.2022 um 11:56 schrieb Jose E. Marchesi via Gcc-patches >> <gcc-patches@gcc.gnu.org>: >> >> The expand_expr_divmod function in expr.cc attempts to optimize cases >> where both arguments of a division/modulus are known to be positive >> when interpreted as signed. In these cases, both signed division and >> unsigned division will raise the same value, and therefore the >> cheapest option can be used. >> >> In order to determine what is the cheaper option in the current >> target, expand_expr_divmod actually expands both a signed divmod and >> an unsigned divmod using local "sequences": >> >> start_sequence (); >> ... >> expand_divmod (... signed ...); >> ... >> end_sequence (); >> >> start_sequence (); >> ... >> expand_divmod (... unsigned ...); >> ... >> end_sequence (); >> >> And then compares the cost of each generated sequence, choosing the >> best one. Finally, it emits the selected expanded sequence and >> returns the rtx with the result. >> >> This approach has a caveat. Some targets do not provide instructions >> for division/modulus instructions. In the case of BPF, it provides >> unsigned division/modulus, but not signed division/modulus. >> >> In these cases, the expand_divmod tries can contain calls to funcalls. >> For example, in BPF: >> >> start_sequence (); >> ... >> expand_divmod (... signed ...); -> This generates funcall to __divdi3 >> ... >> end_sequence (); >> >> start_sequence (); >> ... >> expand_divmod (... unsigned ...); -> This generates direct `div' insn. >> ... >> end_sequence (); >> >> The problem is that when a funcall is expanded, an accompanying global >> symbol definition is written in the output stream: >> >> .global __divdi3 >> >> And this symbol definition remains in the compiled assembly file, even >> if the sequence using the direct `div' instruction above is used. >> >> This is particularly bad in BPF, because the kernel bpf loader chokes >> on the spurious symbol __divdi3 and makes the resulting BPF object >> unloadable (note that BPF objects are not linked before processed by >> the kernel.) >> >> In order to fix this, this patch modifies expand_expr_divmod in the >> following way: >> >> - When trying each sequence (signed, unsigned) the expand_divmod calls >> are told to _not_ use libcalls if everything else fails. This is >> done by passing OPTAB_WIDEN as the `methods' argument. (Before it >> was using the default value OPTAB_LIB_WIDEN.) >> >> - If any of the tried expanded sequences contain a funcall, then the >> optimization is not attempted. > > How do libcalls appear in iff you specify OPTABS_WIDEN only? Doesn’t > that allow to simplify this and also use the sequence without a > libcall?
If you pass OPTABS_WIDEN only then libcalls are not an option and (as far as I can tell) expand_divmod returns NULL if a libcall is the only possibility. > Richard > >> >> A couple of BPF tests are also added to make sure this doesn't break >> at any point in the future. >> >> Tested in bpf-unknown-none and x86_64-linux-gnu. >> Regtested in x86_64-linux-gnu. No regressions. >> >> gcc/ChangeLog >> >> * expr.cc (expand_expr_divmod): Avoid side-effects of trying >> sequences involving funcalls in optimization. >> >> gcc/testsuite/ChangeLog: >> >> * gcc.target/bpf/divmod-funcall-1.c: New test. >> * gcc.target/bpf/divmod-funcall-2.c: Likewise. >> --- >> gcc/expr.cc | 44 +++++++++++-------- >> .../gcc.target/bpf/divmod-funcall-1.c | 8 ++++ >> .../gcc.target/bpf/divmod-funcall-2.c | 8 ++++ >> 3 files changed, 41 insertions(+), 19 deletions(-) >> create mode 100644 gcc/testsuite/gcc.target/bpf/divmod-funcall-1.c >> create mode 100644 gcc/testsuite/gcc.target/bpf/divmod-funcall-2.c >> >> diff --git a/gcc/expr.cc b/gcc/expr.cc >> index d9407432ea5..4d4be5d7bda 100644 >> --- a/gcc/expr.cc >> +++ b/gcc/expr.cc >> @@ -9168,32 +9168,38 @@ expand_expr_divmod (tree_code code, machine_mode >> mode, tree treeop0, >> do_pending_stack_adjust (); >> start_sequence (); >> rtx uns_ret = expand_divmod (mod_p, code, mode, treeop0, treeop1, >> - op0, op1, target, 1); >> + op0, op1, target, 1, OPTAB_WIDEN); >> rtx_insn *uns_insns = get_insns (); >> end_sequence (); >> start_sequence (); >> rtx sgn_ret = expand_divmod (mod_p, code, mode, treeop0, treeop1, >> - op0, op1, target, 0); >> + op0, op1, target, 0, OPTAB_WIDEN); >> rtx_insn *sgn_insns = get_insns (); >> end_sequence (); >> - unsigned uns_cost = seq_cost (uns_insns, speed_p); >> - unsigned sgn_cost = seq_cost (sgn_insns, speed_p); >> >> - /* If costs are the same then use as tie breaker the other other >> - factor. */ >> - if (uns_cost == sgn_cost) >> - { >> - uns_cost = seq_cost (uns_insns, !speed_p); >> - sgn_cost = seq_cost (sgn_insns, !speed_p); >> - } >> - >> - if (uns_cost < sgn_cost || (uns_cost == sgn_cost && unsignedp)) >> - { >> - emit_insn (uns_insns); >> - return uns_ret; >> - } >> - emit_insn (sgn_insns); >> - return sgn_ret; >> + /* Do not try to optimize if any of the sequences tried above >> + resulted in a funcall. */ >> + if (uns_ret && sgn_ret) >> + { >> + unsigned uns_cost = seq_cost (uns_insns, speed_p); >> + unsigned sgn_cost = seq_cost (sgn_insns, speed_p); >> + >> + /* If costs are the same then use as tie breaker the other >> + other factor. */ >> + if (uns_cost == sgn_cost) >> + { >> + uns_cost = seq_cost (uns_insns, !speed_p); >> + sgn_cost = seq_cost (sgn_insns, !speed_p); >> + } >> + >> + if (uns_cost < sgn_cost || (uns_cost == sgn_cost && unsignedp)) >> + { >> + emit_insn (uns_insns); >> + return uns_ret; >> + } >> + emit_insn (sgn_insns); >> + return sgn_ret; >> + } >> } >> return expand_divmod (mod_p, code, mode, treeop0, treeop1, >> op0, op1, target, unsignedp); >> diff --git a/gcc/testsuite/gcc.target/bpf/divmod-funcall-1.c >> b/gcc/testsuite/gcc.target/bpf/divmod-funcall-1.c >> new file mode 100644 >> index 00000000000..dffb1506f06 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/bpf/divmod-funcall-1.c >> @@ -0,0 +1,8 @@ >> +/* { dg-do compile } */ >> +/* { dg-final { scan-assembler-not "__divdi3" } } */ >> + >> +int >> +foo (unsigned int len) >> +{ >> + return ((unsigned long)len) * 234 / 5; >> +} >> diff --git a/gcc/testsuite/gcc.target/bpf/divmod-funcall-2.c >> b/gcc/testsuite/gcc.target/bpf/divmod-funcall-2.c >> new file mode 100644 >> index 00000000000..41e8e40c35c >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/bpf/divmod-funcall-2.c >> @@ -0,0 +1,8 @@ >> +/* { dg-do compile } */ >> +/* { dg-final { scan-assembler-not "__moddi3" } } */ >> + >> +int >> +foo (unsigned int len) >> +{ >> + return ((unsigned long)len) * 234 % 5; >> +} >> -- >> 2.30.2 >>