On Tue, 2023-12-12 at 19:14 +0800, Jiahao Xu wrote:
> Define LOGICAL_OP_NON_SHORT_CIRCUIT as 0, for a short-circuit branch, use the
> short-circuit operation instead of the non-short-circuit operation.
>
> This gives a 1.8% improvement in SPECCPU 2017 fprate on 3A6000.

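For reference, the two lowering strategies at issue can be sketched in plain C. This is only an illustration under stated assumptions: the function names are hypothetical, the condition is cut down to two comparisons, and what the compiler actually emits depends on the target's cost model.

```c
#include <stdbool.h>

/* Hypothetical illustration only; these names appear nowhere in GCC.  */

/* Short-circuit form: the second comparison is evaluated only when the
   first one is false, so each comparison guards its own branch.  */
static bool
overlap_short_circuit (const float *a)
{
  return a[0] > a[3] || a[1] < a[2];
}

/* Non-short-circuit form: both comparisons are always evaluated and
   their results are combined with a bitwise OR, leaving a single
   branch on the combined value.  */
static bool
overlap_non_short_circuit (const float *a)
{
  bool c1 = a[0] > a[3];
  bool c2 = a[1] < a[2];
  return c1 | c2;
}
```

Both functions return the same value for any input; they differ only in how many comparisons are executed and how many conditional branches are needed, which is the trade-off the cost model has to weigh.
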
In r14-15 we removed the LOGICAL_OP_NON_SHORT_CIRCUIT definition because
the default value (1 for all current LoongArch CPUs with branch_cost = 6)
may reduce the number of conditional branch instructions.

I guess the problem here is that a floating-point compare instruction is
much more costly than other instructions, but this is not correctly
modeled yet.  Could you try
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640012.html
where I've raised the fp_add cost (which is used for estimating the
floating-point compare cost) to 5 instructions, and see if it solves your
problem without LOGICAL_OP_NON_SHORT_CIRCUIT?

If not, I guess you can try increasing the floating-point comparison cost
further in loongarch_rtx_costs:

    case UNLT:
      /* Branch comparisons have VOIDmode, so use the first operand's
         mode instead.  */
      mode = GET_MODE (XEXP (x, 0));
      if (FLOAT_MODE_P (mode))
        {
          *total = loongarch_cost->fp_add;

Try to make it fp_add + something?

          return false;
        }
      *total = loongarch_binary_cost (x, COSTS_N_INSNS (1), COSTS_N_INSNS (4),
                                      speed);
      return true;

If adjusting the cost model does not work, I'd say this is a middle-end
issue and we should submit a bug report.

> gcc/ChangeLog:
>
> * config/loongarch/loongarch.h (LOGICAL_OP_NON_SHORT_CIRCUIT): Define.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/loongarch/short-circuit.c: New test.
>
> diff --git a/gcc/config/loongarch/loongarch.h
> b/gcc/config/loongarch/loongarch.h
> index f1350b6048f..880c576c35b 100644
> --- a/gcc/config/loongarch/loongarch.h
> +++ b/gcc/config/loongarch/loongarch.h
> @@ -869,6 +869,7 @@ typedef struct {
>     1 is the default; other values are interpreted relative to that.  */
>
>  #define BRANCH_COST(speed_p, predictable_p) loongarch_branch_cost
> +#define LOGICAL_OP_NON_SHORT_CIRCUIT 0
>
>  /* Return the asm template for a conditional branch instruction.
>     OPCODE is the opcode's mnemonic and OPERANDS is the asm template for
> diff --git a/gcc/testsuite/gcc.target/loongarch/short-circuit.c
> b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
> new file mode 100644
> index 00000000000..bed585ee172
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/loongarch/short-circuit.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ffast-math -fdump-tree-gimple" } */
> +
> +int
> +short_circuit (float *a)
> +{
> +  float t1x = a[0];
> +  float t2x = a[1];
> +  float t1y = a[2];
> +  float t2y = a[3];
> +  float t1z = a[4];
> +  float t2z = a[5];
> +
> +  if (t1x > t2y || t2x < t1y || t1x > t2z || t2x < t1z || t1y > t2z || t2y
> < t1z)
> +    return 0;
> +
> +  return 1;
> +}
> +/* { dg-final { scan-tree-dump-times "if" 6 "gimple" } } */

--
Xi Ruoyao <xry...@xry111.site>
School of Aerospace Science and Technology, Xidian University