[llvm-bugs] [Bug 123065] [AMDGPU][GISel] FMin fmax pattern not recognize

LLVM Bugs via llvm-bugs Wed, 15 Jan 2025 07:01:34 -0800

Issue	123065
Summary	[AMDGPU][GISel] FMin fmax pattern not recognize
Labels	llvm:globalisel
Assignees
Reporter	qcolombet

With GISel, the attached reproducer lowers to compares and selects, whereas SDISel uses fmin and fmax, resulting in a shorter and more efficient code sequence (12 instructions versus 15 in the listings below).
SDISel seems to perform this simplification as part of its IR building process.


# To Reproduce #

Download the attached reproducer or copy/paste the IR below. 
[repro.ll.txt](https://github.com/user-attachments/files/18426310/repro.ll.txt)

Then run:
```bash
llc -O3 -march=amdgcn -mcpu=gfx942 -mtriple amdgcn-amd-amdhsa -global-isel=<0|1> repro.ll -o -
```
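
To compare the two selectors side by side, something like the following sketch may help. It assumes `llc` is on `PATH` and that the reproducer was saved as `repro.ll` in the current directory. The helper name `llc_cmd` and the output file names `sdisel.s` and `gisel.s` are made up for illustration, and the triple is the one from the IR's `target triple` line.

```bash
#!/usr/bin/env bash
# Sketch only: run llc with both instruction selectors and diff the output.
# Assumptions: llc is on PATH, repro.ll is in the current directory;
# llc_cmd, sdisel.s and gisel.s are illustrative names, not part of LLVM.

llc_cmd() {
  # $1 is the -global-isel value: 0 uses SDISel, 1 uses GISel.
  echo "llc -O3 -march=amdgcn -mcpu=gfx942 -mtriple amdgcn-amd-amdhsa -global-isel=$1 repro.ll -o -"
}

if command -v llc >/dev/null 2>&1 && [ -f repro.ll ]; then
  # Unquoted on purpose so the printed command is split into words and executed.
  $(llc_cmd 0) > sdisel.s
  $(llc_cmd 1) > gisel.s
  # diff exits non-zero when the files differ, which is the expected case here.
  diff -u sdisel.s gisel.s || true
fi
```

An empty diff would mean both selectors emit the same code. With this reproducer, the GISel side is expected to show the `v_cmp_le_f16`/`v_cndmask_b32` sequence where SDISel emits `v_max_f16`/`v_min_f16`.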

# Result #

GISel:
```asm
	s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
	v_and_b32_e32 v1, 1, v1
	v_cmp_ne_u32_e32 vcc, 0, v1
	v_mov_b32_e32 v1, 0x57f0
	s_nop 0
	v_cndmask_b32_e32 v0, 0, v0, vcc
	v_cmp_le_f16_e32 vcc, v0, v1
	s_nop 1
	v_cndmask_b32_e32 v0, v1, v0, vcc
	v_cvt_f32_f16_e32 v0, v0
	v_cvt_i32_f32_e32 v2, v0
	v_mov_b64_e32 v[0:1], 0
	global_store_byte v[0:1], v2, off
	s_waitcnt vmcnt(0)
	s_setpc_b64 s[30:31]
```

SDISel:
```asm
	s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
	v_and_b32_e32 v1, 1, v1
	v_cmp_eq_u32_e32 vcc, 1, v1
	s_nop 1
	v_cndmask_b32_e32 v0, 0, v0, vcc
	v_max_f16_e32 v0, v0, v0
	v_min_f16_e32 v0, 0x57f0, v0
	v_cvt_i16_f16_e32 v2, v0
	v_mov_b64_e32 v[0:1], 0
	global_store_byte v[0:1], v2, off
	s_waitcnt vmcnt(0)
	s_setpc_b64 s[30:31]
```

# Note #

Input:
```llvm
target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9"
target triple = "amdgcn-amd-amdhsa"

define void @foo.bb848(<1 x half> %i888, <1 x i1> %0, <1 x i1> %1) {
newFuncRoot:
  %i924 = select <1 x i1> %0, <1 x half> %i888, <1 x half> zeroinitializer
  %.inv24 = fcmp ole <1 x half> %i924, splat (half 0xH57F0)
  %i932 = select <1 x i1> %.inv24, <1 x half> %i924, <1 x half> splat (half 0xH57F0)
  %i940 = fptosi <1 x half> %i932 to <1 x i8>
  store <1 x i8> %i940, ptr addrspace(1) null, align 1
  ret void
}
```

The problem was reduced to make it easier to debug; the original issue involved vectors of 4 elements.

_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
