On Tue, Aug 11, 2020 at 08:01:50PM -0500, Segher Boessenkool wrote: > Hi! > > On Tue, Aug 11, 2020 at 12:23:07PM -0400, Michael Meissner wrote: > > + /* See if we can use the ISA 3.1 min/max/compare instructions for IEEE > > + 128-bit floating point. At present, don't worry about doing > > conditional > > + moves with different types for the comparison and movement (unlike > > SF/DF, > > + where you can do a conditional test between double and use float as > > the > > + if/then parts. */ > > Why is that? That makes the code quite different too (harder to > review), but also, it could just use existing code more.
It is a combinatorial expansion problem. In order to do: double cmove (double x, double y, float a, float b) { return (a == b) ? x : y; } You need two pair of SFDF iterators. You use SFDF for the target, and SFDF2 for the comparison. (define_insn_and_split "*mov<SFDF:mode><SFDF2:mode>cc_p9" [(set (match_operand:SFDF 0 "vsx_register_operand" "=&<SFDF:Fv>,<SFDF:Fv>") (if_then_else:SFDF (match_operator:CCFP 1 "fpmask_comparison_operator" [(match_operand:SFDF2 2 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>") (match_operand:SFDF2 3 "vsx_register_operand" "<SFDF2:Fv>,<SFDF2:Fv>")]) (match_operand:SFDF 4 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>") (match_operand:SFDF 5 "vsx_register_operand" "<SFDF:Fv>,<SFDF:Fv>"))) (clobber (match_scratch:V2DI 6 "=0,&wa"))] "TARGET_P9_MINMAX" "#" "" [(set (match_dup 6) (if_then_else:V2DI (match_dup 1) (match_dup 7) (match_dup 8))) (set (match_dup 0) (if_then_else:SFDF (ne (match_dup 6) (match_dup 8)) (match_dup 4) (match_dup 5)))] { if (GET_CODE (operands[6]) == SCRATCH) operands[6] = gen_reg_rtx (V2DImode); operands[7] = CONSTM1_RTX (V2DImode); operands[8] = CONST0_RTX (V2DImode); } [(set_attr "length" "8") (set_attr "type" "vecperm")]) This means there are 4 versions of the pattern: 1: DF target, DF comparison 2: DF target, SF comparison 3: SF target, DF comparison 4: SF target, SF comparison The dueling iterators was added in May 26th, 2016. I believe we had separate insns for the 4 cases for fsel before I added the xscmpeqdp, etc. support. To grow this for IEEE 128, you would need two iterators, each with 4 elements (DF, SF, KF, and optionally TF). Thus you would need 16 patterns to represent all of the patterns. I can do this, I just didn't think it was worth it. In addition, it becomes more involved due to constraints. Currently for SF/DF you essentially want to use any vector register at this point (when I added it in 2016, there was still the support for limiting whether SF/DF could use the Altivec registers). But for IEEE 128-bit fp types, you want "v" for the registers used for comparison. You might want "v" for the contitional move result, or you might want "wa". I looked at combining the SF/DF and IEEE 128-bit cases, but it was becoming too complex to describe these cases. It is doable, but it makes the diffs even harder to read. > > +;; IEEE 128-bit min/max > > +(define_insn "s<minmax><mode>3" > > + [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v") > > + (fp_minmax:IEEE128 > > + (match_operand:IEEE128 1 "altivec_register_operand" "v") > > + (match_operand:IEEE128 2 "altivec_register_operand" "v")))] > > + "TARGET_FLOAT128_HW && TARGET_POWER10 && FLOAT128_IEEE_P (<MODE>mode)" > > + "xs<minmax>cqp %0,%1,%2" > > + [(set_attr "type" "fp") > > + (set_attr "size" "128")]) > > So why do we want the asymmetrical xsmincqp instead of xsminqp? This > should be documented at the very least. (We also should have min/max > that work correctly without -ffast-math, of course :-( ). We don't have an xsminqp or xsmaxqp instruction in power10. We only have xsmincqp and xsmaxcqp instructions. > > +;; IEEE 128-bit conditional move. At present, don't worry about doing > > +;; conditional moves with different types for the comparison and movement > > +;; (unlike SF/DF, where you can do a conditional test between double and > > use > > +;; float as the if/then parts. > > (Unmatched brackets). So why is this? "At present" doesn't belong in > the code btw, but in your patch description. Typo. > > +(define_insn_and_split "*mov<mode>cc_hardware" > > Please don't use meaningless names like this. > > > +(define_insn "*xxsel<mode>" Yes, this was more due to the fact that I was cloning the SF/DF code, and that doesn't have XXSEL support. I'll look at that. > This already exists for other vector modes. Put it together with that > please. Same for all other insn patterns. > > Ideally you can make those one pattern then, even. > > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-2.c > > @@ -0,0 +1,70 @@ > > +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ > > +/* { dg-require-effective-target power10_ok } */ > > +/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */ > > +/* { dg-final { scan-assembler-not "xscmpuqp" } } */ > > +/* { dg-final { scan-assembler "xscmpeqqp" } } */ > > +/* { dg-final { scan-assembler "xscmpgtqp" } } */ > > +/* { dg-final { scan-assembler "xscmpgeqp" } } */ > > +/* { dg-final { scan-assembler "xsmaxcqp" } } */ > > +/* { dg-final { scan-assembler "xsmincqp" } } */ > > +/* { dg-final { scan-assembler "xxsel" } } */ > > Use \m \M please. Don't ask for powerpc* in gcc.target/powerpc/ (it is > implied there). Why does it need lp64? There should be some test for > __float128, instead. > > All in all, this is very hard to review :-( -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797