[llvm-bugs] [Bug 129608] s390x: `vec_subc_u128` is not recognized

LLVM Bugs via llvm-bugs Mon, 03 Mar 2025 16:22:07 -0800

Issue	129608
Summary	s390x: `vec_subc_u128` is not recognized
Labels	new issue
Assignees
Reporter	folkertdev

    https://godbolt.org/z/EjoWhj8MM

I expect these to optimize to the same output, but they do not:


```llvm
define noundef <16 x i8> @vec_subc_u128_intrinsic(<16 x i8> %a, <16 x i8> %b) unnamed_addr {
start:
  %0 = bitcast <16 x i8> %a to i128
  %1 = bitcast <16 x i8> %b to i128
  %_3 = tail call noundef i128 @llvm.s390.vscbiq(i128 noundef %0, i128 noundef %1) #3
  %2 = bitcast i128 %_3 to <16 x i8>
  ret <16 x i8> %2
}

define <16 x i8> @vec_subc_u128_manual(<16 x i8> %a, <16 x i8> %b) unnamed_addr {
start:
  %0 = bitcast <16 x i8> %a to i128
  %1 = bitcast <16 x i8> %b to i128
  %_8.1 = icmp uge i128 %0, %1
  %_5 = zext i1 %_8.1 to i128
  %2 = bitcast i128 %_5 to <16 x i8>
  ret <16 x i8> %2
}

declare i128 @llvm.s390.vscbiq(i128, i128) unnamed_addr #2
```

The equivalent with `vec_addc_u128` does get optimized into just a `vaccq` instruction. For the subtraction here we get this:

```asm
vec_subc_u128_intrinsic:
        vscbiq  %v24, %v24, %v26
        br      %r14

.LCPI1_0:
        .quad   0
        .quad   1
vec_subc_u128_manual:
        veclg   %v24, %v26
        jlh     .LBB1_2
        vchlgs  %v0, %v26, %v24
.LBB1_2:
        ipm     %r0
        xilf    %r0, 268435456
        afi     %r0, 1879048192
        vlvgp   %v0, %r0, %r0
        larl    %r1, .LCPI1_0
        vl      %v1, 0(%r1), 3
        vrepib  %v2, 31
        vsrlb   %v0, %v0, %v2
        vsrl    %v0, %v0, %v2
        vn      %v24, %v0, %v1
        br      %r14
```

which is unfortunate.

In the addition case, we see:

```llvm
define <16 x i8> @vec_addc_u128_manual(<16 x i8> %a, <16 x i8> %b) unnamed_addr {
start:
  %0 = bitcast <16 x i8> %a to i128
  %1 = bitcast <16 x i8> %b to i128
  %2 = tail call { i128, i1 } @llvm.uadd.with.overflow.i128(i128 %0, i128 %1)
  %_7.1 = extractvalue { i128, i1 } %2, 1
  %_5 = zext i1 %_7.1 to i128
  %3 = bitcast i128 %_5 to <16 x i8>
  ret <16 x i8> %3
}
```

So here `@llvm.uadd.with.overflow.i128` is explicitly present in the IR. That won't work for the unsigned overflowing subtraction: the optimizer is too clever and reduces it to a plain compare (the `icmp uge` in `vec_subc_u128_manual` above).
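To make the equivalence concrete, here is a minimal scalar sketch in Rust. It assumes, as the manual IR above implies, that the borrow indication is 1 exactly when `a >= b` (that is, when `a - b` does not wrap). The function names are hypothetical and model only the `i128` values, not the `<16 x i8>` bitcasts or the s390x intrinsic itself.

```rust
/// Compare form, mirroring `icmp uge` + `zext` in `vec_subc_u128_manual`:
/// yields 1 when `a >= b`, else 0.
pub fn subc_u128_compare(a: u128, b: u128) -> u128 {
    (a >= b) as u128
}

/// Overflow form: take the borrow flag of an unsigned subtraction and
/// invert it, so the result is 1 when the subtraction does NOT wrap.
pub fn subc_u128_overflow(a: u128, b: u128) -> u128 {
    let (_, borrow) = a.overflowing_sub(b);
    (!borrow) as u128
}

/// Addition counterpart, mirroring `llvm.uadd.with.overflow.i128` + `zext`
/// in `vec_addc_u128_manual`: yields 1 when `a + b` wraps, else 0.
pub fn addc_u128_overflow(a: u128, b: u128) -> u128 {
    let (_, carry) = a.overflowing_add(b);
    carry as u128
}
```

The two subtraction forms agree for every input, since an unsigned subtraction wraps exactly when `a < b`. That would explain why the optimizer is free to keep only the compare, leaving the backend to recognize an `icmp uge` pattern rather than an overflow intrinsic.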

_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
