Kito, Thanks for the reality-check: the subreg-expressions are getting in the way. I'll drop this from v2, as a permanent resolution for this will be a bit more involved.
Philipp. On Thu, 11 Nov 2021 at 17:42, Kito Cheng <kito.ch...@gmail.com> wrote: > > Hi Philipp: > > This testcase got wrong result with this patch even w/o > <bitmanip_optab>si3_sext pattern: > > #include <stdio.h> > > #define MAX(A, B) ((A) > (B) ? (A) : (B)) > > long long __attribute__((noinline, noipa)) > foo6(long long a, long long b, int c) > { > int xa = a; > int xb = b; > return MAX(MAX(xa, xb), c); > } > int main() { > long long a = 0x200000000ll; > long long b = 0x1ffffffffl; > int c = 10; > long long d = foo6(a, b, c); > printf ("%lld %lld %d = %lld\n", a, b, c, d); > return 0; > } > > On Fri, Nov 12, 2021 at 12:27 AM Kito Cheng <kito.ch...@gmail.com> wrote: > > > > IIRC it's not work even without sign extend pattern since I did similar > > experimental before (not for RISC-V, but same concept), I guess I need more > > time to test that. > > > > Philipp Tomsich <philipp.toms...@vrull.eu> 於 2021年11月12日 週五 00:18 寫道: > >> > >> Kito, > >> > >> Unless I am missing something, the problem is not the relaxation to > >> GPR, but rather the sign-extending pattern I had squashed into the > >> same patch. > >> If you disable "<bitmanip_optab>si3_sext", a sext.w will be have to be > >> emitted after the 'max' and before the return (or before the SImode > >> output is consumed as a DImode), pushing the REE opportunity to a > >> subsequent consumer (e.g. an addw). > >> > >> This will generate > >> foo6: > >> max a0,a0,a1 > >> sext.w a0,a0 > >> ret > >> which (assuming that the inputs to max are properly sign-extended > >> SImode values living in DImode registers) will be the same as > >> performing the two sext.w before the max. > >> > >> Having a second set of eyes on this is appreciated — let me know if > >> you agree and I'll revise, once I have collected feedback on the > >> remaining patches of the series. > >> > >> Philipp. > >> > >> > >> On Thu, 11 Nov 2021 at 17:00, Kito Cheng <kito.ch...@gmail.com> wrote: > >> > > >> > Hi Philipp: > >> > > >> > We can't pretend we have SImode min/max instruction without that > >> > semantic. > >> > Give this testcase, x86 and rv64gc print out 8589934592 8589934591 = 0, > >> > but with this patch and compile with rv64gc_zba_zbb -O3, the output > >> > become 8589934592 8589934591 = 8589934592 > >> > > >> > -------------Testcase--------------- > >> > #include <stdio.h> > >> > long long __attribute__((noinline, noipa)) > >> > foo6(long long a, long long b) > >> > { > >> > int xa = a; > >> > int xb = b; > >> > return (xa > xb ? xa : xb); > >> > } > >> > int main() { > >> > long long a = 0x200000000ll; > >> > long long b = 0x1ffffffffl; > >> > long long c = foo6(a, b); > >> > printf ("%lld %lld = %lld\n", a, b, c); > >> > return 0; > >> > } > >> > -------------------------------------- > >> > v64gc_zba_zbb -O3 w/o this patch: > >> > foo6: > >> > sext.w a1,a1 > >> > sext.w a0,a0 > >> > max a0,a0,a1 > >> > ret > >> > > >> > -------------------------------------- > >> > v64gc_zba_zbb -O3 w/ this patch: > >> > foo6: > >> > max a0,a0,a1 > >> > ret > >> > > >> > On Thu, Nov 11, 2021 at 10:10 PM Philipp Tomsich > >> > <philipp.toms...@vrull.eu> wrote: > >> > > > >> > > While min/minu/max/maxu instructions are provided for XLEN only, these > >> > > can safely operate on GPRs (i.e. SImode or DImode for RV64): SImode is > >> > > always sign-extended, which ensures that the XLEN-wide instructions > >> > > can be used for signed and unsigned comparisons on SImode yielding a > >> > > correct ordering of value. > >> > > > >> > > This commit > >> > > - relaxes the minmax pattern to express for GPR (instead of X only), > >> > > providing both a si3 and di3 expansion on RV64 > >> > > - adds a sign-extending form for thee si3 pattern for RV64 to all REE > >> > > to eliminate redundant extensions > >> > > - adds test-cases for both > >> > > > >> > > gcc/ChangeLog: > >> > > > >> > > * config/riscv/bitmanip.md: Relax minmax to GPR (i.e SImode or > >> > > DImode) on RV64. > >> > > * config/riscv/bitmanip.md (<bitmanip_optab>si3_sext): Add > >> > > pattern for REE. > >> > > > >> > > gcc/testsuite/ChangeLog: > >> > > > >> > > * gcc.target/riscv/zbb-min-max.c: Add testcases for SImode > >> > > operands checking that no redundant sign- or zero-extensions > >> > > are emitted. > >> > > > >> > > Signed-off-by: Philipp Tomsich <philipp.toms...@vrull.eu> > >> > > --- > >> > > > >> > > gcc/config/riscv/bitmanip.md | 14 +++++++++++--- > >> > > gcc/testsuite/gcc.target/riscv/zbb-min-max.c | 20 +++++++++++++++++--- > >> > > 2 files changed, 28 insertions(+), 6 deletions(-) > >> > > > >> > > diff --git a/gcc/config/riscv/bitmanip.md > >> > > b/gcc/config/riscv/bitmanip.md > >> > > index 000deb48b16..2a28f78f5f6 100644 > >> > > --- a/gcc/config/riscv/bitmanip.md > >> > > +++ b/gcc/config/riscv/bitmanip.md > >> > > @@ -260,13 +260,21 @@ (define_insn "bswap<mode>2" > >> > > [(set_attr "type" "bitmanip")]) > >> > > > >> > > (define_insn "<bitmanip_optab><mode>3" > >> > > - [(set (match_operand:X 0 "register_operand" "=r") > >> > > - (bitmanip_minmax:X (match_operand:X 1 "register_operand" "r") > >> > > - (match_operand:X 2 "register_operand" > >> > > "r")))] > >> > > + [(set (match_operand:GPR 0 "register_operand" "=r") > >> > > + (bitmanip_minmax:GPR (match_operand:GPR 1 "register_operand" > >> > > "r") > >> > > + (match_operand:GPR 2 "register_operand" > >> > > "r")))] > >> > > "TARGET_ZBB" > >> > > "<bitmanip_insn>\t%0,%1,%2" > >> > > [(set_attr "type" "bitmanip")]) > >> > > > >> > > +(define_insn "<bitmanip_optab>si3_sext" > >> > > + [(set (match_operand:DI 0 "register_operand" "=r") > >> > > + (sign_extend:DI (bitmanip_minmax:SI (match_operand:SI 1 > >> > > "register_operand" "r") > >> > > + (match_operand:SI 2 "register_operand" > >> > > "r"))))] > >> > > + "TARGET_64BIT && TARGET_ZBB" > >> > > + "<bitmanip_insn>\t%0,%1,%2" > >> > > + [(set_attr "type" "bitmanip")]) > >> > > + > >> > > ;; orc.b (or-combine) is added as an unspec for the benefit of the > >> > > support > >> > > ;; for optimized string functions (such as strcmp). > >> > > (define_insn "orcb<mode>2" > >> > > diff --git a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c > >> > > b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c > >> > > index f44c398ea08..7169e873551 100644 > >> > > --- a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c > >> > > +++ b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c > >> > > @@ -1,5 +1,5 @@ > >> > > /* { dg-do compile } */ > >> > > -/* { dg-options "-march=rv64gc_zbb -mabi=lp64 -O2" } */ > >> > > +/* { dg-options "-march=rv64gc_zba_zbb -mabi=lp64 -O2" } */ > >> > > > >> > > long > >> > > foo1 (long i, long j) > >> > > @@ -25,7 +25,21 @@ foo4 (unsigned long i, unsigned long j) > >> > > return i > j ? i : j; > >> > > } > >> > > > >> > > +unsigned int > >> > > +foo5(unsigned int a, unsigned int b) > >> > > +{ > >> > > + return a > b ? a : b; > >> > > +} > >> > > + > >> > > +int > >> > > +foo6(int a, int b) > >> > > +{ > >> > > + return a > b ? a : b; > >> > > +} > >> > > + > >> > > /* { dg-final { scan-assembler-times "min" 3 } } */ > >> > > -/* { dg-final { scan-assembler-times "max" 3 } } */ > >> > > +/* { dg-final { scan-assembler-times "max" 4 } } */ > >> > > /* { dg-final { scan-assembler-times "minu" 1 } } */ > >> > > -/* { dg-final { scan-assembler-times "maxu" 1 } } */ > >> > > +/* { dg-final { scan-assembler-times "maxu" 3 } } */ > >> > > +/* { dg-final { scan-assembler-not "zext.w" } } */ > >> > > +/* { dg-final { scan-assembler-not "sext.w" } } */ > >> > > -- > >> > > 2.32.0 > >> > >