On 2025/10/29 at 9:15 AM, Guo Jie wrote:
The current implementation of the fnmam4 instruction template requires
the third source operand to be assigned the same hard register as the
target operand, but the constraint is not documented in the instruction
manual or standard template definitions. The current constraint will
generate additional data dependencies and extra instructions.

gcc/ChangeLog:

        * config/loongarch/lasx.md (fnma<mode>4): Remove.
        * config/loongarch/lsx.md (fnma<mode>4): Remove.
        * config/loongarch/simd.md (fnma<mode>4): Simplify and correct.

gcc/testsuite/ChangeLog:

        * gcc.target/loongarch/fnmam4-vec.c: New test.
---
  gcc/config/loongarch/lasx.md                    | 10 ----------
  gcc/config/loongarch/lsx.md                     | 10 ----------
  gcc/config/loongarch/simd.md                    | 11 +++++++++++
  gcc/testsuite/gcc.target/loongarch/fnmam4-vec.c | 14 ++++++++++++++
  4 files changed, 25 insertions(+), 20 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/loongarch/fnmam4-vec.c

/* snip */
+;; <x>vfnmsub.{s/d}
+(define_insn "fnma<mode>4"
+  [(set (match_operand:FVEC 0 "register_operand" "=f")
+       (fma:FVEC (neg:FVEC (match_operand:FVEC 1 "register_operand" "f"))
+                 (match_operand:FVEC 2 "register_operand" "f")
+                 (match_operand:FVEC 3 "register_operand" "f")))]
+  ""
+  "<x>vfnmsub.<simdfmt>\t%<wu>0,%<wu>1,%<wu>2,%<wu>3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+

Hi, Guojie:

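This is the same problem as in the scalar fnma template.

The fnma pattern is defined to compute -a*b+c, but [x]vfnmsub computes
-(a*b-c). The two differ in the sign of a zero result: for example, when
a*b and c are equal and nonzero, the first gives +0 and the second gives
-0 (under round-to-nearest).

So we need to add the condition "!HONOR_SIGNED_ZEROS (<MODE>mode)".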
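
To illustrate the signed-zero difference, here is a minimal C sketch. It is only a model of the sign behavior: it uses plain (unfused) arithmetic rather than the actual instruction, and the function names fnma_expected, vfnmsub_model and sign_differs are hypothetical, chosen just for this example.

```c
#include <math.h>   /* signbit */

/* Semantics of the fnma pattern as described above: (-a)*b + c. */
static double fnma_expected(double a, double b, double c)
{
  return -a * b + c;
}

/* Semantics described above for [x]vfnmsub: -(a*b - c).
   Assumption: modeled with unfused arithmetic, so only the sign of the
   result is meaningful here, not the rounding of a fused operation. */
static double vfnmsub_model(double a, double b, double c)
{
  return -(a * b - c);
}

/* Returns nonzero when the two forms give results with different signs. */
static int sign_differs(double a, double b, double c)
{
  return (signbit(fnma_expected(a, b, c)) != 0)
         != (signbit(vfnmsub_model(a, b, c)) != 0);
}
```

For a = b = c = 1.0 the first form yields +0.0 and the second yields -0.0, so sign_differs returns nonzero. For inputs whose result is not zero, the two forms agree. This assumes the default round-to-nearest mode and a build without -ffast-math or -fno-signed-zeros.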

