[llvm-bugs] [Bug 129401] Improve `llvm.ucmp.i8.i1` codegen

LLVM Bugs via llvm-bugs Sat, 01 Mar 2025 12:59:53 -0800

Issue	129401
Summary	Improve `llvm.ucmp.i8.i1` codegen
Labels	new issue
Assignees
Reporter	scottmcm

(Context: while working on a rustc PR, I accidentally regressed `bool::cmp` by making it use `llvm.ucmp`. Hence this report: it would be nice if `ucmp` were simply smart about this case.)


`llvm.ucmp.i8.i1(a, b)` is actually the same as just `zext(a) - zext(b)`: <https://alive2.llvm.org/ce/z/oHq3bh>
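
Since there are only four possible `i1` input pairs, the equivalence can also be checked exhaustively by hand. The following is a minimal Rust sketch of that check, not the rustc or LLVM implementation; the helper names `ucmp_i1` and `zext_sub` are hypothetical, and `bool::cmp` (where `false < true`) is assumed to model the unsigned comparison of two `i1` values.

```rust
use std::cmp::Ordering;

/// Hypothetical model of `llvm.ucmp.i8.i1`: returns -1, 0, or 1
/// according to an unsigned comparison of the two one-bit inputs.
fn ucmp_i1(a: bool, b: bool) -> i8 {
    match a.cmp(&b) {
        Ordering::Less => -1,
        Ordering::Equal => 0,
        Ordering::Greater => 1,
    }
}

/// Hypothetical model of the proposed form: zero-extend both
/// inputs to i8 (0 or 1) and subtract.
fn zext_sub(a: bool, b: bool) -> i8 {
    (a as i8) - (b as i8)
}

fn main() {
    // Exhaustive check over all four input pairs.
    for a in [false, true] {
        for b in [false, true] {
            assert_eq!(ucmp_i1(a, b), zext_sub(a, b));
        }
    }
    println!("all four input pairs agree");
}
```

The subtraction can never overflow because both operands are 0 or 1, which is also why the `nsw` flag on the `sub` in `@tgt` is valid.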

But today they don't codegen the same: <https://llvm.godbolt.org/z/nxWdYhvTo>

```llvm
define noundef range(i8 -1, 2) i8 @src(i1 noundef zeroext %a, i1 noundef zeroext %b) unnamed_addr {
start:
  %0 = call i8 @llvm.ucmp.i8.i1(i1 %a, i1 %b)
  ret i8 %0
}

define noundef range(i8 -1, 2) i8 @tgt(i1 noundef zeroext %a, i1 noundef zeroext %b) unnamed_addr {
start:
  %aa = zext i1 %a to i8
  %bb = zext i1 %b to i8
  %0 = sub nsw i8 %aa, %bb
  ret i8 %0
}
```
On x86-64 this gives:
```asm
src:                                    # @src
        cmp     dil, sil
        seta    al
        sbb     al, 0
        ret
tgt:                                    # @tgt
        mov     eax, edi
        sub     al, sil
        ret
```

I don't know whether it's better to InstSimplify the `ucmp` to `zext(a) - zext(b)` or to improve the codegen for the `i1` case, but either way, it would be nice if the intrinsic worked optimally for `i1` as well as for the wider widths.
