On Thu, 2020-09-24 at 19:40 -0500, Segher Boessenkool wrote: > On Thu, Sep 24, 2020 at 11:04:38AM -0500, will schmidt wrote: > > [PATCH 2/2, rs6000] VSX load/store rightmost element operations > > > > Hi, > > This adds support for the VSX load/store rightmost element > > operations. > > This includes the instructions lxvrbx, lxvrhx, lxvrwx, lxvrdx, > > stxvrbx, stxvrhx, stxvrwx, stxvrdx; And the builtins > > vec_xl_sext() /* vector load sign extend */ > > vec_xl_zext() /* vector load zero extend */ > > vec_xst_trunc() /* vector store truncate */. > > > > Testcase results show that the instructions added with this patch > > show > > up at low/no optimization (-O0), with a number of those being > > replaced > > with other load and store instructions at higher optimization > > levels. > > For consistency I've left the tests at -O0. > > > > Regtested OK for Linux on power8,power9 targets. Sniff-regtested > > OK on > > power10 simulator. > > OK for trunk? > > > > Thanks, > > -Will > > > > gcc/ChangeLog: > > * config/rs6000/altivec.h (vec_xl_zest, vec_xl_sext, > > vec_xst_trunc): New > > defines. > > vec_xl_zext (no humour there :-) ).
Lol.. one of them slipped through.. my muscle memory struggled on typing these.. :-) > > > +BU_P10V_OVERLOAD_X (SE_LXVRX, "se_lxvrx") > > +BU_P10V_OVERLOAD_X (ZE_LXVRX, "ze_lxvrx") > > +BU_P10V_OVERLOAD_X (TR_STXVRX, "tr_stxvrx") > > I'm not a fan of the cryptic names. I guess I'll get used to them ;- > ) > > > + if (op0 == const0_rtx) > > + addr = gen_rtx_MEM (blk ? BLKmode : tmode, op1); > > That indent is broken. > > > + else > > + { > > + op0 = copy_to_mode_reg (mode0, op0); > > And so is this. Should be two spaces, not three. > > > + addr = gen_rtx_MEM (blk ? BLKmode : smode, > > + gen_rtx_PLUS (Pmode, op1, op0)); > > "gen_rtx_PLUS" should line up with "blk". > > > + if (sign_extend) > > + { > > + rtx discratch = gen_reg_rtx (DImode); > > + rtx tiscratch = gen_reg_rtx (TImode); > > More broken indentation. (And more later.) > > > + // emit the lxvr*x insn. > > Use only /* comments */ please, don't mix them. Emit with a capital > E. > > > + pat = GEN_FCN (icode) (tiscratch, addr); > > + if (! pat) > > No space after "!" (or any other unary op other than casts and sizeof > and the like). > > > + // Emit a sign extention from QI,HI,WI to double. > > "extension" willdo, thanks > > > +;; Store rightmost element into store_data > > +;; using stxvrbx, stxvrhx, strvxwx, strvxdx. > > +(define_insn "vsx_stxvr<wd>x" > > + [(set > > + (match_operand:INT_ISA3 0 "memory_operand" "=Z") > > + (truncate:INT_ISA3 (match_operand:TI 1 > > "vsx_register_operand" "wa")))] > > + "TARGET_POWER10" > > + "stxvr<wd>x %1,%y0" > > %x1 I think? I'll doublecheck. > > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/powerpc/vsx-load-element-extend- > > char.c > > @@ -0,0 +1,168 @@ > > +/* > > + * Test of vec_xl_sext and vec_xl_zext (load into rightmost > > + * vector element and zero/sign extend). */ > > + > > +/* { dg-do compile {target power10_ok} } */ > > +/* { dg-do run {target power10_hw} } */ > > +/* { dg-require-effective-target power10_ok } */ > > +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ > > If you dg_require it, why test it on the "dg-do compile" line? It > will > *work* with it of course, but it is puzzling :-) I've had both compile-time and run-time versions of the test. In this case I wanted to try to handle both, so compile when I can compile it, and run when I can run it, etc. If that combo doesn't work the way I expect it to, i'll need to split them out into separate tests. > > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/powerpc/vsx-load-element-extend- > > int.c > > @@ -0,0 +1,165 @@ > > +/* > > + * Test of vec_xl_sext and vec_xl_zext (load into rightmost > > + * vector element and zero/sign extend). */ > > + > > +/* { dg-do compile {target power10_ok} } */ > > +/* { dg-do run {target power10_hw} } */ > > +/* { dg-require-effective-target power10_ok } */ > > +/* { dg-options "-mdejagnu-cpu=power10 -O0" } */ > > Please comment here what that -O0 is for? So that we still know when > we > read it decades from now ;-) I've got it commented at least once, I'll make sure to get all the instances covered. > > > +/* { dg-final { scan-assembler-times {\mlxvrwx\M} 2 } } */ > > +/* { dg-final { scan-assembler-times {\mlwax\M} 0 } } */ > > Maybe all of {\mlwa} here? lwax was sufficient for what I sniff-tested. I'll double-check. Thanks, -Will > > > Segher