On 6/20/2022 5:56 AM, Richard Biener via Gcc-patches wrote:
> Note one option would be to emit a multiply with { 1, -1, 1, -1 } on
> GIMPLE where then targets could opt-in to handle this via a DFmode
> negate via a combine pattern? Not sure if this can be even done
> starting from the vec-perm RTL IL.
FWIW, FP multiply is the same cost as FP add/sub on our target.
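
For anyone following along, the equivalence behind that suggestion can be sketched in plain C. This is only an illustration, not GCC code: the function names are made up, it assumes a little-endian layout where each 64-bit lane overlays two 32-bit float lanes, and it ignores NaN corner cases where a multiply and a sign-bit flip may differ.

```c
#include <stdint.h>
#include <string.h>

/* Negate the odd-indexed float lanes by multiplying with { 1, -1, 1, -1 }.
   Hypothetical helper, for illustration only.  */
static void
negate_odd_by_mul (float v[4])
{
  static const float signs[4] = { 1.0f, -1.0f, 1.0f, -1.0f };
  for (int i = 0; i < 4; i++)
    v[i] *= signs[i];
}

/* Get the same result by flipping the top bit of each 64-bit lane, which
   is what a DFmode-style negate does to the underlying bits.  Assumes
   little-endian, so bit 63 of lane i is the sign bit of float 2*i + 1.  */
static void
negate_odd_by_df_signbit (float v[4])
{
  uint64_t lanes[2];
  memcpy (lanes, v, sizeof lanes);
  for (int i = 0; i < 2; i++)
    lanes[i] ^= UINT64_C (1) << 63;
  memcpy (v, lanes, sizeof lanes);
}
```

On a target where the DF lanes do not overlay pairs of SF lanes this way, the second function has no cheap register-level equivalent.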
> I fear whether (neg:V2DF (subreg:V2DF (reg:V4SF))) is a good idea
> will heavily depend on the target CPU (not only the ISA). For RISC-V
> for example I think the DF lanes do not overlap with two SF lanes
> (so same with gcn I think).
Absolutely. I've regularly seen introduction of subregs like that
ultimately result in the SUBREG_REG object getting dumped into memory
rather than being allocated into a register. It could well be a problem
with our port; I haven't started chasing it down yet.
One such case where that came up recently was the addition of something
like this to simplify-rtx. Basically in some cases we can turn a
VEC_SELECT into a SUBREG, so I had this little hack in simplify-rtx that
I was playing with:
+      /* If we have a VEC_SELECT of a SUBREG try to change the SUBREG so
+         that we eliminate the VEC_SELECT.  */
+      if (GET_CODE (op0) == SUBREG
+          && subreg_lowpart_p (op0)
+          && VECTOR_MODE_P (GET_MODE (op0))
+          && GET_MODE_INNER (GET_MODE (op0)) == mode
+          && XVECLEN (trueop1, 0) == 1
+          && CONST_INT_P (XVECEXP (trueop1, 0, 0)))
+        {
+          return simplify_gen_subreg (mode, SUBREG_REG (op0),
+                                      GET_MODE (SUBREG_REG (op0)),
+                                      INTVAL (XVECEXP (trueop1, 0, 0)) * 8);
+        }
Seemed like a no-brainer win, but in reality it made things worse pretty
consistently.
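
To spell out the idea behind the hack: selecting a single element from a vector is the same as reading the element-sized chunk at byte offset index * element size. The sketch below models that in plain C. It is not GCC code: the helper names are hypothetical and it assumes 8-byte elements and memory-order offsets, which is also what the hard-coded "* 8" in the hack assumes.

```c
#include <stddef.h>
#include <string.h>

/* Model of a single-element VEC_SELECT on a two-element double vector.
   Hypothetical helper, for illustration only.  */
static double
select_elem (const double vec[2], size_t idx)
{
  return vec[idx];
}

/* Model of a SUBREG-style access: reinterpret the bytes that start at
   BYTE within the vector as one double.  */
static double
subreg_at (const double vec[2], size_t byte)
{
  double out;
  memcpy (&out, (const unsigned char *) vec + byte, sizeof out);
  return out;
}
```

With these, select_elem (v, i) and subreg_at (v, i * sizeof (double)) read the same value. That the two forms are equivalent says nothing about which one the register allocator handles better.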
jeff