| Field | Value |
|---|---|
| Issue | 123079 |
| Summary | [AMDGPU][GISel] Missing FMAX3 use |
| Labels | good first issue, backend:AMDGPU, llvm:globalisel |
| Assignees | |
| Reporter | qcolombet |

GISel fails to use the max3 (and probably min3) instruction on AMDGPU. Instead it uses a sequence of 2 max instructions.
SDISel gets this right.
I believe the AMDGPU GISel support is missing a port of the `SITargetLowering::performMinMaxCombine` optimization.
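
To make the missing transformation concrete, here is a minimal, self-contained sketch of the kind of fold that would be expected: a nested two-operand `max` whose inner `max` has a single user is rewritten into one three-operand `max3`. Everything in it (`Expr`, `max2`, `max3`, `foldToMax3`, `toString`, the `uses` counter) is a hypothetical toy written for illustration. It does not use LLVM's APIs, it is not the code of `performMinMaxCombine`, and it ignores floating-point details such as NaN handling and canonicalization.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy expression node for illustration only; this is NOT an LLVM data structure.
struct Expr {
  enum Kind { Leaf, Max, Max3 };
  Kind kind = Leaf;
  std::string name;               // only meaningful for Leaf
  std::shared_ptr<Expr> a, b, c;  // operands (c is only used by Max3)
  int uses = 1;                   // hypothetical user count of this value
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr leaf(std::string n) {
  auto e = std::make_shared<Expr>();
  e->name = std::move(n);
  return e;
}

ExprPtr max2(ExprPtr x, ExprPtr y) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Max;
  e->a = std::move(x);
  e->b = std::move(y);
  return e;
}

ExprPtr max3(ExprPtr x, ExprPtr y, ExprPtr z) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Max3;
  e->a = std::move(x);
  e->b = std::move(y);
  e->c = std::move(z);
  return e;
}

// Fold max(max(a, b), c) or max(a, max(b, c)) into max3(a, b, c).
// The single-use check is an assumed profitability guard: if the inner max
// has other users it must stay, so folding would not save an instruction.
ExprPtr foldToMax3(const ExprPtr &e) {
  if (e->kind != Expr::Max)
    return e;
  if (e->a->kind == Expr::Max && e->a->uses == 1)
    return max3(e->a->a, e->a->b, e->b);
  if (e->b->kind == Expr::Max && e->b->uses == 1)
    return max3(e->a, e->b->a, e->b->b);
  return e;  // nothing to fold
}

// Render an expression as text so the result of the fold can be inspected.
std::string toString(const ExprPtr &e) {
  switch (e->kind) {
  case Expr::Leaf:
    return e->name;
  case Expr::Max:
    return "max(" + toString(e->a) + ", " + toString(e->b) + ")";
  case Expr::Max3:
    return "max3(" + toString(e->a) + ", " + toString(e->b) + ", " +
           toString(e->c) + ")";
  }
  return "";
}
```

In this toy model, folding `max2(max2(leaf("a"), leaf("b")), leaf("c"))` yields a single `max3` node, which corresponds to the single `v_max3_f32` that SDISel emits. An actual fix would presumably be expressed as a GlobalISel combine on the generic min/max opcodes rather than on a tree like this one.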
# To Reproduce #
Download the attached IR or copy/paste it from below.
Then run:
```bash
llc -O3 -march=amdgcn -mcpu=gfx942 -mtriple amdgcn-amd-hmcsa -global-isel=<0|1> repro.ll -o -
```
[repro.ll.txt](https://github.com/user-attachments/files/18427260/repro.ll.txt)
# Result #
GISel:
```asm
s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
v_max_f32_e32 v0, v0, v0
v_max_f32_e32 v0, 0, v0
v_mov_b32_e32 v4, v1
v_mov_b32_e32 v5, v2
v_max_f32_e32 v0, 0, v0
flat_store_dword v[4:5], v0
s_waitcnt vmcnt(0) lgkmcnt(0)
s_setpc_b64 s[30:31]
```
SDISel:
```asm
s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
v_mov_b32_e32 v3, v2
v_mov_b32_e32 v2, v1
v_max3_f32 v0, v0, 0, 0
flat_store_dword v[2:3], v0
s_waitcnt vmcnt(0) lgkmcnt(0)
s_setpc_b64 s[30:31]
```
GISel uses 3 `max` instructions where SDISel manages to do the same thing with just one `max3` instruction.
Note: The test case was automatically reduced, so the constant input values are not representative of the real workload.