On Fri, 2020-06-12 at 18:31 -0500, Segher Boessenkool wrote: > Hi! > > On Thu, Jun 11, 2020 at 11:22:33PM -0500, will schmidt wrote: > > Fix codegen implementation for the builtin > > vec_pack_to_short_fp32. > > +;; Convert two vector F32 to packed vector f16. > > +(define_expand "convert_4f32_8f16" > > + [(set (match_operand:V8HI 0 "register_operand" "=v") > > + (unspec:V8HI [(match_operand:V4SF 1 "register_operand" "v") > > + (match_operand:V4SF 2 "register_operand" "v")] > > + UNSPEC_CONVERT_4F32_8F16))] > > + "TARGET_P9_VECTOR" > > +{ > > + rtx rtx_tmp_hi = gen_reg_rtx (V4SImode); > > + rtx rtx_tmp_lo = gen_reg_rtx (V4SImode); > > + > > + emit_insn (gen_vsx_xvcvsphp (rtx_tmp_hi, operands[1] )); > > + emit_insn (gen_vsx_xvcvsphp (rtx_tmp_lo, operands[2] )); > > + if (BYTES_BIG_ENDIAN) > > + emit_insn (gen_altivec_vpkuwum (operands[0], rtx_tmp_lo, > > rtx_tmp_hi)); > > + else > > + emit_insn (gen_altivec_vpkuwum (operands[0], rtx_tmp_hi, > > rtx_tmp_lo)); > > + DONE; > > +}) > > Why is this different from the 8i16, which doesn't have the swap of > the > operands for BE?
The 8f16 implementation specifies the swap variation between LE/BE. I'm not sure of the 8i16 implementation was defined without that, or if this was just missed. I suspect it should have been there too. > > If the comment before the pattern would say how it orders the > elements, > that might clarify things. (This really is documenting the unspec, > but > there is no better place for documenting this afaics). Ok. I'll double-check and ensure i've got the right wording. > > > +;; Generate xvcvsphp > > +(define_insn "vsx_xvcvsphp" > > + [(set (match_operand:V4SI 0 "register_operand" "=wa") > > + (unspec:V4SI [(match_operand:V4SF 1 "vsx_register_operand" > > "wa")] > > + UNSPEC_VSX_XVCVSPHP))] > > (Should be a tab before the unspec as well, not eight spaces). I see a tab and 5 spaces. I'll double-check. > > > --- a/gcc/testsuite/gcc.target/powerpc/builtins-1-p9-runnable.c > > +++ b/gcc/testsuite/gcc.target/powerpc/builtins-1-p9-runnable.c > > @@ -1,25 +1,37 @@ > > /* { dg-do run { target { powerpc*-*-linux* && { lp64 && > > p9vector_hw } } } } */ > > Why the lp64 here? > > > /* { dg-require-effective-target powerpc_p9vector_ok } */ > > That is implied by p9vector_hw (or it should be, at least :-) ) Hmm, yes. Those are issues with the original test. I'll update. Thanks -Will > > > Segher