| Issue | 170477 |
| Summary | [InstCombine] Increased reg usage and performance drop on AMD GPUs |
| Labels | new issue |
| Assignees | |
| Reporter | akadutta |

Changes to InstCombine over the past several months have caused a significant performance drop on AMD GPUs for memory-bound applications, especially those with GEP-heavy memory access patterns. We have observed slowdowns of 70-80% for whole applications and up to 20x for individual kernels. Most of this performance drop can be attributed to the following changes:
- Splitting GEPs with multiple non-zero indices into multiple GEP instructions. This produces more instructions and increases register pressure.
- Leading zero index stripping. This removes critical information and makes alias analysis less precise, which is especially pronounced for struct fields. It also leads to less effective vectorization and register handling.
- Multi-index restriction in visitGEPOfGEP. This prevents GEPs from being combined. GPU addressing is generally simpler than CPU addressing, so splitting GEPs instead of combining them results in more arithmetic instructions on GPUs.
Overall, we have observed a drastic increase in VGPR usage, which leads to significant register spilling and, in turn, degrades performance.
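
To make the first point concrete, here is a minimal C++ sketch of the address arithmetic behind a single multi-index GEP and behind an equivalent chain of split GEPs. The struct `S`, the function names, and the IR in the comments are hypothetical and chosen only for illustration; they are not taken from the affected kernels or from actual InstCombine output. Both forms compute the same byte offset, but the split form names two extra intermediate values, which is the kind of additional live state that can raise register pressure if the backend does not fold the chain back together.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical struct, used only for illustration.
struct S {
    std::int32_t tag;
    float data[16];
};

constexpr std::int64_t kStructSize = static_cast<std::int64_t>(sizeof(S));
constexpr std::int64_t kDataOffset = static_cast<std::int64_t>(offsetof(S, data));
constexpr std::int64_t kElemSize   = static_cast<std::int64_t>(sizeof(float));

// Byte offset of one GEP with several non-zero indices, roughly:
//   %p = getelementptr %struct.S, ptr %base, i64 %i, i32 1, i64 %j
constexpr std::int64_t combined_offset(std::int64_t i, std::int64_t j) {
    return i * kStructSize + kDataOffset + j * kElemSize;
}

// The same offset as a chain of single-step GEPs, roughly:
//   %p0 = getelementptr %struct.S, ptr %base, i64 %i
//   %p1 = getelementptr i8, ptr %p0, i64 <offset of data>
//   %p2 = getelementptr float, ptr %p1, i64 %j
constexpr std::int64_t split_offset(std::int64_t i, std::int64_t j) {
    std::int64_t p0 = i * kStructSize;    // first intermediate value
    std::int64_t p1 = p0 + kDataOffset;   // second intermediate value
    std::int64_t p2 = p1 + j * kElemSize; // final address offset
    return p2;
}

// Both forms address the same element.
static_assert(combined_offset(3, 5) == split_offset(3, 5),
              "split and combined GEP forms must agree");
```

The sketch only models offsets; it says nothing about how the AMDGPU backend selects instructions for either form.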