Issue 123079
Summary [AMDGPU][GISel] Missing FMAX3 use
Labels good first issue, backend:AMDGPU, llvm:globalisel
Assignees
Reporter qcolombet
GISel fails to use the max3 (and probably min3) instruction on AMDGPU. Instead it uses a sequence of 2 max instructions.
SDISel gets this right.

I believe the AMDGPU GlobalISel path is missing a port of the `SITargetLowering::performMinMaxCombine` optimization.
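For readers unfamiliar with that combine, here is a rough sketch of the kind of fold involved: a two-operand `max` whose operand is itself a single-use `max` is rewritten into one three-operand `max3`. This is a toy model, not LLVM code. `Node`, `leaf`, `max2`, `combineMax3`, and `print` are hypothetical names made up for illustration, and the single-use check is an assumption about what such a combine would normally require.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy expression node. This is NOT an LLVM data structure.
struct Node {
  std::string op;                           // "max", "max3", or a leaf name
  std::vector<std::shared_ptr<Node>> args;  // empty for leaves
  int uses;                                 // number of users of this value
  Node(std::string o, std::vector<std::shared_ptr<Node>> a)
      : op(std::move(o)), args(std::move(a)), uses(1) {}
};

using NodePtr = std::shared_ptr<Node>;

NodePtr leaf(const std::string &name) {
  return std::make_shared<Node>(name, std::vector<NodePtr>{});
}

NodePtr max2(NodePtr a, NodePtr b) {
  return std::make_shared<Node>("max", std::vector<NodePtr>{a, b});
}

// Fold max(max(a, b), c) or max(a, max(b, c)) into max3(...).
// The inner max must have a single use (assumed requirement), otherwise
// it would still have to be computed separately and nothing is saved.
NodePtr combineMax3(const NodePtr &n) {
  if (n->op != "max" || n->args.size() != 2)
    return n;
  for (int i = 0; i < 2; ++i) {
    const NodePtr &inner = n->args[i];
    const NodePtr &other = n->args[1 - i];
    if (inner->op == "max" && inner->args.size() == 2 && inner->uses == 1) {
      return std::make_shared<Node>(
          "max3",
          std::vector<NodePtr>{inner->args[0], inner->args[1], other});
    }
  }
  return n;  // pattern did not match; leave the node unchanged
}

// Render a node as text, e.g. "max3(x, 0, 0)".
std::string print(const NodePtr &n) {
  if (n->args.empty())
    return n->op;
  std::string s = n->op + "(";
  for (std::size_t i = 0; i < n->args.size(); ++i) {
    if (i != 0)
      s += ", ";
    s += print(n->args[i]);
  }
  return s + ")";
}
```

Applied to `max2(max2(leaf("x"), leaf("0")), leaf("0"))`, the sketch yields a single `max3` node over `x`, `0`, `0`, which is the shape SDISel ends up selecting in the output below. A real GlobalISel port would presumably match `G_FMAXNUM`-style generic opcodes in a combiner rule and would also need to handle the min and integer variants.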

# To Reproduce #

Download the attached IR or copy/paste it from below.
Then run:
```bash
llc -O3 -march=amdgcn -mcpu=gfx942 -mtriple amdgcn-amd-amdhsa -global-isel=<0|1> repro.ll -o -
```
[repro.ll.txt](https://github.com/user-attachments/files/18427260/repro.ll.txt)

# Result #

GISel:
```asm
	s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
	v_max_f32_e32 v0, v0, v0
	v_max_f32_e32 v0, 0, v0
	v_mov_b32_e32 v4, v1
	v_mov_b32_e32 v5, v2
	v_max_f32_e32 v0, 0, v0
	flat_store_dword v[4:5], v0
	s_waitcnt vmcnt(0) lgkmcnt(0)
	s_setpc_b64 s[30:31]
```

SDISel:
```asm
	s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
	v_mov_b32_e32 v3, v2
	v_mov_b32_e32 v2, v1
	v_max3_f32 v0, v0, 0, 0
	flat_store_dword v[2:3], v0
	s_waitcnt vmcnt(0) lgkmcnt(0)
	s_setpc_b64 s[30:31]
```

GISel uses 3 `max` instructions where SDISel manages to do the same thing with just one `max3` instruction.

Note: The test case was automatically reduced, so the constant input values are not representative of the real workload.