Issue 131531
Summary `llvm.fma.bf16` intrinsic is expanded incorrectly
Labels new issue
Assignees
Reporter beetrees
Consider the following LLVM IR:
```llvm
define bfloat @do_fma(bfloat %a, bfloat %b, bfloat %c) {
    %res = call bfloat @llvm.fma.bf16(bfloat %a, bfloat %b, bfloat %c)
    ret bfloat %res
}
```

LLVM turns this into the equivalent of:
```llvm
define bfloat @do_fma(bfloat %a, bfloat %b, bfloat %c) {
    %a_f32 = fpext bfloat %a to float
    %b_f32 = fpext bfloat %b to float
    %c_f32 = fpext bfloat %c to float
    %res_f32 = call float @llvm.fma.f32(float %a_f32, float %b_f32, float %c_f32)
    %res = fptrunc float %res_f32 to bfloat
    ret bfloat %res
}
```

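This is a miscompilation: `float` does not have enough precision to perform a fused multiply-add for `bfloat` without double rounding becoming an issue. For instance, `do_fma(0x1.40p+127, 0x1.04p+0, 0x1.00p-133)` should return `0x1.46p+127`, but LLVM's lowering to `float` FMA returns the incorrect result `0x1.44p+127`. The exact product `0x1.45p+127` lies exactly halfway between two adjacent `bfloat` values, and the tiny addend should break the tie upward. The `float` FMA rounds the addend away, so the following `fptrunc` sees an exact tie and rounds to even.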
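The double rounding in the example above can be reproduced with exact rational arithmetic. The following is a minimal Python sketch, not LLVM's code: `round_to_format` is a hypothetical helper that models round-to-nearest-even for positive values only and ignores overflow, and the formats are described only by their significand width and minimum normal exponent.

```python
from fractions import Fraction

def round_to_format(x: Fraction, precision: int, emin: int) -> Fraction:
    """Round a positive rational to nearest-even in a binary format with
    `precision` significand bits and minimum normal exponent `emin`.
    Hypothetical helper: no overflow, sign, or NaN handling."""
    if x == 0:
        return x
    # Find e with 2**e <= x < 2**(e + 1).
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    e = max(e, emin)  # subnormals share the minimum exponent
    ulp = Fraction(2) ** (e - precision + 1)
    q = x / ulp
    n = q.numerator // q.denominator  # floor, since x > 0
    rem = q - n
    if rem > Fraction(1, 2) or (rem == Fraction(1, 2) and n % 2 == 1):
        n += 1
    return n * ulp

BF16 = (8, -126)    # (significand bits, minimum normal exponent)
F32 = (24, -126)
F64 = (53, -1022)

# All three inputs are exactly representable as Python floats (doubles).
a = Fraction(float.fromhex("0x1.40p+127"))
b = Fraction(float.fromhex("0x1.04p+0"))
c = Fraction(float.fromhex("0x1.00p-133"))

exact = a * b + c                                    # infinitely precise a*b + c
correct = round_to_format(exact, *BF16)              # single rounding
via_f32 = round_to_format(round_to_format(exact, *F32), *BF16)
via_f64 = round_to_format(round_to_format(exact, *F64), *BF16)

for name, value in [("correct", correct), ("via_f32", via_f32), ("via_f64", via_f64)]:
    print(name, float(value).hex())
```

Under these assumptions, `correct` is `0x1.46p+127`, while `via_f32` and `via_f64` are both `0x1.44p+127`, matching the values reported above.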

Just using `double` instead of `float` would not be a correct lowering either: it gives the same incorrect result for the example above (using the reasoning from https://github.com/llvm/llvm-project/issues/128450#issuecomment-2727540179, a 126 + 127 + 8 = 261-bit significand would be required for double rounding not to be a problem with this lowering). I suspect the best option here is to lower to a libcall instead.
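One way to see where a figure of that size comes from is to count the bits that the exact result of the example occupies. This is only an illustrative sketch for this one input, and not necessarily the derivation used in the linked comment.

```python
from fractions import Fraction

a = Fraction(float.fromhex("0x1.40p+127"))
b = Fraction(float.fromhex("0x1.04p+0"))
c = Fraction(float.fromhex("0x1.00p-133"))

exact = a * b + c

# Express the exact result in units of 2**-133, the smallest bfloat subnormal.
scaled = exact * 2**133
assert scaled.denominator == 1  # the exact result is an integer in these units

# Number of bits between the leading bit (2**127) and the trailing bit (2**-133).
span_bits = scaled.numerator.bit_length()
print(span_bits)
```

The leading bit sits at `2**127` and the trailing bit at `2**-133`, so the exact value spans 127 + 133 + 1 = 261 bits. The trailing bit is the one that breaks the tie, and it does not fit in the 24-bit significand of `float` or the 53-bit significand of `double`.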

Closely related to #98389/#128450
