[llvm-bugs] [Bug 128237] AVX-512 Mask registers being used when it's not needed

LLVM Bugs via llvm-bugs Fri, 21 Feb 2025 13:53:03 -0800

Issue: 128237
Summary: AVX-512 Mask registers being used when it's not needed
Labels: new issue
Assignees: (none)
Reporter: Whatcookie

I've been running into some odd assembly generated by RPCS3's SPU LLVM backend.


In short: the AVX-512 code is slower than the AVX2 code because compare-into-mask instructions are being used where compare-into-vector instructions would be faster.

https://godbolt.org/z/dcjTKKaWj

In the FCGT3 function, both the AVX2 and AVX-512 targets use the compare-into-vector-register instructions, as expected. In the FCGT2 function, where the only difference is `fcmp ugt` in place of `fcmp ogt`, LLVM opts to use the mask registers instead. This is inconvenient because we're emulating instructions that compare into vector registers, so the mask result has to be moved back into a vector register afterwards.

```
        vpminud xmm0, xmm0, xmmword ptr [rdi + rcx]
        vcmpnleps       xmm0, xmm0, xmmword ptr [rdi + rax]
```

```
        vpminud xmm0, xmm0, dword ptr [rip + .LCPI1_0]{1to4}
        vcmpnleps       k0, xmm0, xmmword ptr [rdi + rax]
        vpmovm2d        xmm0, k0
```
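
For readers less familiar with the two predicates, here is a minimal scalar sketch of what each lane computes. It is only a model, not RPCS3 or LLVM code, and every function name in it is hypothetical. It assumes default IEEE comparison semantics (no `-ffast-math`). `fcmp ogt` is false when either operand is NaN, while `fcmp ugt` is true in that case, which is the "not less-or-equal" test that `vcmpnleps` performs. The sketch also models the mask path (one bit per lane, then expanded the way `vpmovm2d` does) to show that it yields the same lanes as comparing directly into a vector.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical scalar model of 4 x 32-bit lanes; names are illustrative only.
using Floats = std::array<float, 4>;
using Lanes  = std::array<std::uint32_t, 4>;

// All-ones or all-zeros lane, as a compare-into-vector instruction produces.
constexpr std::uint32_t lane_from_bool(bool b) {
    return b ? 0xFFFFFFFFu : 0u;
}

// fcmp ogt: true only if neither operand is NaN and a > b.
inline std::uint32_t fcmp_ogt_lane(float a, float b) {
    return lane_from_bool(a > b);
}

// fcmp ugt: true if either operand is NaN or a > b, i.e. !(a <= b).
inline std::uint32_t fcmp_ugt_lane(float a, float b) {
    return lane_from_bool(!(a <= b));
}

// Vector path: each lane receives its result directly.
inline Lanes ugt_direct(const Floats& a, const Floats& b) {
    Lanes out{};
    for (std::size_t i = 0; i < 4; ++i) {
        out[i] = fcmp_ugt_lane(a[i], b[i]);
    }
    return out;
}

// Mask path: pack one bit per lane (like a compare into k0),
// then expand each bit to a full lane (like vpmovm2d).
inline Lanes ugt_via_mask(const Floats& a, const Floats& b) {
    std::uint8_t k = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        if (!(a[i] <= b[i])) {
            k = static_cast<std::uint8_t>(k | (1u << i));
        }
    }
    Lanes out{};
    for (std::size_t i = 0; i < 4; ++i) {
        out[i] = lane_from_bool(((k >> i) & 1u) != 0);
    }
    return out;
}

// The two predicates differ only when a NaN is involved.
inline void check_nan_behaviour() {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    assert(fcmp_ogt_lane(nan, 1.0f) == 0u);
    assert(fcmp_ugt_lane(nan, 1.0f) == 0xFFFFFFFFu);
}
```

Both paths produce identical lanes in this model, so the extra mask-to-vector step changes only the instruction count, not the result.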

_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
