[llvm-bugs] [Bug 49738] New: [Matrix] LowerMatrixIntrinsics should preserve existing fast-math flags during lowering

via llvm-bugs Sat, 27 Mar 2021 05:32:18 -0700

https://bugs.llvm.org/show_bug.cgi?id=49738


            Bug ID: 49738
           Summary: [Matrix] LowerMatrixIntrinsics should preserve
                    existing fast-math flags during lowering
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Scalar Optimizations
          Assignee: unassignedb...@nondot.org
          Reporter: florian_h...@apple.com
                CC: llvm-bugs@lists.llvm.org

Currently, LowerMatrixIntrinsics does not propagate the fast-math flags present
on matrix intrinsics and other instructions with shape information to the
instructions it emits during lowering.

For the example below, `opt -lower-matrix-intrinsics` creates fmuladd/fadd/fmul
instructions without `fast`: https://godbolt.org/z/1o48Tx1bP

This also means we fail to fold redundant FP instructions. In this case, we end
up with fmuladd calls that have zero operands and could be simplified if they
carried `fast`: https://godbolt.org/z/5oWs1oh91

define <4 x float> @foo(<4 x float> %m, float %x, float %y) {
  %i1 = insertelement <4 x float> <float poison, float 0.000000e+00, float 0.000000e+00, float poison>, float %x, i64 0
  %i2 = insertelement <4 x float> %i1, float %y, i64 3
  %res = tail call fast <4 x float> @llvm.matrix.multiply.v4f32.v4f32.v4f32(<4 x float> %m, <4 x float> %i1, i32 2, i32 2, i32 2)
  %res.2 = fadd fast <4 x float> %res, %m
  ret <4 x float> %res
}

declare <4 x float> @llvm.matrix.multiply.v4f32.v4f32.v4f32(<4 x float>, <4 x float>, i32, i32, i32)
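
To show the missed fold in isolation, here is a minimal scalar sketch. It is
not taken from the godbolt output above, and the function names @no_fmf and
@with_fmf are made up for illustration. The expectation, assuming the usual
instcombine behaviour, is that only the call carrying `fast` can be reduced to
%acc.

```llvm
declare float @llvm.fmuladd.f32(float, float, float)

; No fast-math flags: %x * 0.0 may be NaN (if %x is NaN or infinity) or -0.0,
; so the multiply cannot be dropped and the call has to stay.
define float @no_fmf(float %x, float %acc) {
  %r = call float @llvm.fmuladd.f32(float %x, float 0.000000e+00, float %acc)
  ret float %r
}

; With `fast` (which implies nnan, ninf and nsz), %x * 0.0 may be treated as
; 0.0 and 0.0 + %acc as %acc, so the whole call is a candidate for folding.
define float @with_fmf(float %x, float %acc) {
  %r = call fast float @llvm.fmuladd.f32(float %x, float 0.000000e+00, float %acc)
  ret float %r
}
```

Running something like `opt -passes=instcombine -S` over both functions should
make the difference visible. The exact output may vary between LLVM versions.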

