Issue 129978
Summary AMDGPU should form s_addk_i32 and s_mulk_i32 earlier
Labels backend:AMDGPU
Assignees
Reporter arsenm
Currently AMDGPU forms these in SIShrinkInstructions, which runs twice. The first run sets a register hint that the input and the result should be the same register. The second, post-regalloc run performs the s_add_i32 -> s_addk_i32 / s_mul_i32 -> s_mulk_i32 transform if the source and destination registers are the same.
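
To make the post-regalloc condition concrete, here is a minimal toy model in Python. It is not the SIShrinkInstructions code: `ScalarOp`, `fits_simm16`, and `shrink_to_k` are hypothetical names invented for illustration. It assumes only that the K forms take a signed 16-bit immediate and read and write the same register. The real pass applies further checks that are not modeled here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# The K forms encode a signed 16-bit immediate (simm16).
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1

# Non-K opcode -> K opcode, for the two cases discussed in this issue.
K_FORMS = {"s_add_i32": "s_addk_i32", "s_mul_i32": "s_mulk_i32"}


@dataclass(frozen=True)
class ScalarOp:
    """Hypothetical stand-in for `dst = opcode src, imm` after allocation."""
    opcode: str
    dst: str
    src: str
    imm: int


def fits_simm16(value: int) -> bool:
    """True if the value is representable as a signed 16-bit immediate."""
    return INT16_MIN <= value <= INT16_MAX


def shrink_to_k(op: ScalarOp) -> Optional[Tuple[str, str, int]]:
    """Return (k_opcode, dst, imm) if the op can use the K form, else None."""
    if op.opcode not in K_FORMS:
        return None
    # The K form has a single register operand, so the allocator must have
    # assigned the same physical register to source and destination.
    if op.dst != op.src:
        return None
    if not fits_simm16(op.imm):
        return None
    return (K_FORMS[op.opcode], op.dst, op.imm)


# Same register on both sides: the K form is available.
print(shrink_to_k(ScalarOp("s_add_i32", "s0", "s0", 1000)))
# The allocator ignored the hint: the transform is silently lost.
print(shrink_to_k(ScalarOp("s_add_i32", "s1", "s0", 1000)))
```

The second call shows the fragile part: nothing fails when the hint is not honored, the instruction simply stays in its longer form.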

This kind of works, but could be better. When touching the allocator and copy-related patches, I regularly see regressions where the formation of the K forms is broken. Using tied operands to begin with is a much stronger hint to the allocator to prefer the tied form. This is how we handle the FMA vs. fmac cases.

This can be done in 3 parts:

1. Handle s_addk_i32 -> s_add_i32 and s_mulk_i32 -> s_mul_i32 in convertToThreeAddress
2. Teach FoldImmediate to form the K forms from the non-K forms
3. Modify constant selection patterns to directly emit the K forms for valid immediates