Issue 94025
Summary AVX-512 mask registers spill to stack when GPRs are available
Labels new issue
Assignees
Reporter embg
I hit some cases where LLVM spills AVX-512 mask registers (k0-k7) to the stack in memory-bound code, where the extra stack loads and stores are quite expensive. Some of these spills may be avoidable through better scheduling, but I think making the spills cheaper (when they are unavoidable) would also be useful.

I noticed that GCC spills these registers to GPRs -- could LLVM do the same?

Here is a minimal example:
* clang spills mask registers to the stack: https://godbolt.org/z/e9nGb6z8s
* gcc spills mask registers to GPRs: https://godbolt.org/z/h46cdje7a

Here is a more realistic scenario, where better scheduling could theoretically eliminate the spills (but doesn't): https://godbolt.org/z/Pz1dsrh53

For the realistic scenario, I also observed that GCC spills to GPRs, while LLVM spills to the stack.
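
For readers who cannot open the Compiler Explorer links, the kind of situation described above can be sketched as follows. This is a hypothetical illustration, not the code behind the links: the names `opaque_hook`, `max16_avx512`, `max16_scalar`, and `max16_after_call` are made up, and the sketch assumes GCC or Clang on x86-64. It relies on the fact that k0-k7 are caller-saved in the System V x86-64 ABI, so a mask that is live across a call has to be saved somewhere, either in a stack slot or in a callee-saved GPR.

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical hook. Calling through a volatile pointer hides the callee
   from the compiler, so it must assume every caller-saved register
   (including k0-k7) is clobbered by the call. */
static void nop(void) {}
void (*volatile opaque_hook)(void) = nop;

/* AVX-512 path: the mask `lt` is computed before the call and consumed
   after it, so its value has to survive the call. */
__attribute__((target("avx512f")))
static void max16_avx512(const int32_t *a, const int32_t *b, int32_t *out) {
    __m512i va = _mm512_loadu_si512(a);
    __m512i vb = _mm512_loadu_si512(b);
    __mmask16 lt = _mm512_cmplt_epi32_mask(va, vb);  /* bit i set when a[i] < b[i] */
    opaque_hook();                                    /* mask is live across this call */
    /* Take b[i] where the mask bit is set, a[i] otherwise: element-wise max. */
    _mm512_storeu_si512(out, _mm512_mask_blend_epi32(lt, va, vb));
}

/* Scalar fallback so the sketch also runs on CPUs without AVX-512. */
static void max16_scalar(const int32_t *a, const int32_t *b, int32_t *out) {
    opaque_hook();
    for (int i = 0; i < 16; i++)
        out[i] = a[i] < b[i] ? b[i] : a[i];
}

/* Runtime dispatch: use the AVX-512 path only when the CPU supports it. */
void max16_after_call(const int32_t *a, const int32_t *b, int32_t *out) {
    if (__builtin_cpu_supports("avx512f"))
        max16_avx512(a, b, out);
    else
        max16_scalar(a, b, out);
}
```

Compiling this with `-O2 -S` and reading the body of `max16_avx512` should show where the mask goes around the call: a `kmovw` to a stack slot would match the LLVM behavior reported here, and a `kmovw` to a GPR would match the GCC behavior. The exact output for this sketch has not been checked and may differ by compiler version.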

I chatted offline with @MatzeB, who indicated that this would probably require a lot of work in the register allocator. I understand if the gains aren't large enough to justify this work. But I still thought it might be useful to share this data point with the community.
_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
