On 6/20/2022 5:56 AM, Richard Biener via Gcc-patches wrote:


> Note one option would be to emit a multiply with { 1, -1, 1, -1 } on
> GIMPLE where then targets could opt-in to handle this via a DFmode
> negate via a combine pattern?  Not sure if this can be even done
> starting from the vec-perm RTL IL.

FWIW, FP multiply is the same cost as FP add/sub on our target.

> I fear whether (neg:V2DF (subreg:V2DF (reg:V4SF))) is a good idea
> will heavily depend on the target CPU (not only the ISA).  For RISC-V
> for example I think the DF lanes do not overlap with two SF lanes
> (so same with gcn I think).

Absolutely.  I've regularly seen the introduction of subregs like that ultimately result in the SUBREG_REG object getting dumped into memory rather than being allocated to a register.  It could well be a problem with our port; I haven't started chasing it down yet.
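
For reference, a minimal sketch of why a DF negate on a V4SF register can stand in for a multiply by { 1, -1, 1, -1 }.  It assumes a little-endian target where each 64-bit lane overlays two adjacent IEEE-754 single-precision lanes, which is exactly the property questioned above for RISC-V and gcn.  The function name is hypothetical and the code only models the bit-level effect; it is not GCC code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of (neg:V2DF (subreg:V2DF (reg:V4SF))): flip the sign bit of
   each 64-bit lane.  ASSUMPTION: little-endian layout, so bit 63 of
   each 64-bit lane is the sign bit of the odd-numbered float.  */
void negate_odd_lanes_via_u64 (float v[4])
{
  uint64_t pair[2];

  /* The trick only makes sense if two floats fill one 64-bit lane.  */
  assert (sizeof (float) * 2 == sizeof (uint64_t));

  memcpy (pair, v, sizeof pair);
  for (int i = 0; i < 2; i++)
    pair[i] ^= UINT64_C (1) << 63;   /* Sign bit of the 64-bit lane.  */
  memcpy (v, pair, sizeof pair);
}
```

Under those assumptions, lanes 0 and 2 are left untouched and lanes 1 and 3 change sign.  On a target where the lanes do not overlap this way the equivalence does not hold.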

One such case where that came up recently was the addition of something like this to simplify-rtx.  Basically in some cases we can turn a VEC_SELECT into a SUBREG, so I had this little hack in simplify-rtx that I was playing with:
+      /* If we have a VEC_SELECT of a SUBREG try to change the SUBREG so
+        that we eliminate the VEC_SELECT.  */
+      if (GET_CODE (op0) == SUBREG
+         && subreg_lowpart_p (op0)
+         && VECTOR_MODE_P (GET_MODE (op0))
+         && GET_MODE_INNER (GET_MODE (op0)) == mode
+         && XVECLEN (trueop1, 0) == 1
+         && CONST_INT_P (XVECEXP (trueop1, 0, 0)))
+       {
+         return simplify_gen_subreg (mode, SUBREG_REG (op0),
+                                     GET_MODE (SUBREG_REG (op0)),
+                                     INTVAL (XVECEXP (trueop1, 0, 0)) * 8);
+       }

Seemed like a no-brainer win, but in reality it made things worse pretty consistently.
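
The arithmetic behind the hack can be sketched outside GCC: selecting element N of a vector is the same as reading at byte offset N times the element size.  The `* 8` in the snippet above presumably reflects 8-byte elements; a more general form would scale by the element size of the mode.  The helper names below are hypothetical, and the sketch uses plain memory order and ignores register endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte offset of element INDEX in a vector whose elements are
   ELT_SIZE bytes wide (memory order).  */
size_t element_byte_offset (size_t index, size_t elt_size)
{
  return index * elt_size;
}

/* Read element INDEX of a 4 x double vector through its byte offset,
   the way a subreg names part of a wider register.  */
double select_double (const double vec[4], size_t index)
{
  double out;

  assert (index < 4);
  memcpy (&out,
          (const unsigned char *) vec
          + element_byte_offset (index, sizeof out),
          sizeof out);
  return out;
}
```

With 8-byte elements, index 2 maps to byte offset 16; with 4-byte elements the same index maps to offset 8, so a hard-coded 8 is only right for one element width.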

jeff
