[llvm-bugs] [Bug 172045] Improve AMDGPU sqrt and inverse sqrt handling for bfloat

LLVM Bugs via llvm-bugs Fri, 12 Dec 2025 08:46:48 -0800

Issue	172045
Summary	Improve AMDGPU sqrt and inverse sqrt handling for bfloat
Labels	backend:AMDGPU, missed-optimization
Assignees
Reporter	arsenm

The code generated for targets without v_sqrt_bf16 and v_rsq_bf16 is quite poor: https://github.com/llvm/llvm-project/pull/172044

The current lowering appears to cast to float and then perform the full-precision float expansion. It could instead use one of the faster options, which would bring it closer to the f16 expansion.
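
To illustrate why a cheaper float path may be enough, here is a minimal Python sketch that emulates a bfloat sqrt by extending to float, taking the square root, and rounding back to bfloat with round-to-nearest-even. It is not the AMDGPU lowering. The helper names (`f32_to_bf16`, `bf16_sqrt`, `count_mismatches`) are hypothetical, and the `ulp_error` parameter is an assumed stand-in for a faster float sqrt whose result is off by a fixed number of float ulps.

```python
import math
import struct


def f32_bits(x: float) -> int:
    """Bit pattern of x after rounding to IEEE-754 binary32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(b: int) -> float:
    """binary32 value for a 32-bit pattern."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def f32_to_bf16(x: float) -> int:
    """Round a finite float to bfloat bits (round-to-nearest-even)."""
    b = f32_bits(x)
    return (b + 0x7FFF + ((b >> 16) & 1)) >> 16


def bf16_to_f32(h: int) -> float:
    """Extend bfloat bits to float; this conversion is exact."""
    return bits_f32(h << 16)


def bf16_sqrt(h: int, ulp_error: int = 0) -> int:
    """Extend to float, sqrt, round back to bfloat.

    ulp_error shifts the float result by that many float ulps to model
    a less precise sqrt (an assumption, not a hardware spec).
    """
    r = f32_bits(math.sqrt(bf16_to_f32(h)))
    return f32_to_bf16(bits_f32(r + ulp_error))


def count_mismatches(ulp_error: int) -> int:
    """Count positive finite bfloat inputs whose result changes
    when the float sqrt is perturbed by ulp_error float ulps."""
    return sum(
        bf16_sqrt(h) != bf16_sqrt(h, ulp_error)
        for h in range(0x0001, 0x7F80)  # 0x7F80 is +inf, excluded
    )


if __name__ == "__main__":
    print(hex(bf16_sqrt(0x4080)))  # input 0x4080 is 4.0 in bfloat
    print(count_mismatches(1), count_mismatches(-1))
```

Running `count_mismatches` with a small `ulp_error` gives a rough sense of how much float precision the final rounding to bfloat can absorb. It does not model denormal flushing or the behavior of any specific instruction.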

_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
