| Field | Value |
|---|---|
| Issue | 144561 |
| Summary | [AMDGPU][True16] si-fix-sgpr-copies: invalid sgpr_lo16 copy destination |
| Labels | backend:AMDGPU |
| Assignees | |
| Reporter | frederik-h |

Running the `si-fix-sgpr-copies` pass on the following machine IR changes the register class of `%7`, which is used as an operand of `V_CNDMASK_B16_t16_e64`, from `vgpr_16` to `sgpr_lo16`. That class is invalid for this operand, and the machine verifier rejects the result.
## test.mir
```
---
name: sgpr_copy_invalid_type
tracksRegLiveness: true
body: |
  bb.0:
    %0:vgpr_32 = IMPLICIT_DEF
    %1:sreg_32 = IMPLICIT_DEF
    %2:sreg_32_xm0_xexec = IMPLICIT_DEF
    %3:sgpr_lo16 = COPY undef %0.lo16
    %4:sreg_32 = COPY undef %3
    %5:sreg_32 = S_AND_B32 undef %4, killed undef %1, implicit-def dead $scc
    %6:sgpr_32 = S_CVT_F32_F16 killed undef %5, implicit $mode
    %7:vgpr_16 = COPY undef %3
    %8:vgpr_16 = V_CNDMASK_B16_t16_e64 0, undef %7, 0, 1, killed undef %2, 0, implicit $exec
    S_ENDPGM 0
  bb.2:
    successors: %bb.0(0x80000000)
    S_CMP_LG_U32 killed undef %4, killed undef %4, implicit-def $scc
    S_CBRANCH_SCC1 %bb.0, implicit undef $scc
    S_ENDPGM 0
...
```
## llc invocation
`llc -mtriple=amdgcn -mcpu=gfx1150 -mattr=+real-true16 -print-changed=cdiff -run-pass=si-fix-sgpr-copies -debug-only=si-fix-sgpr-copies -verify-machineinstrs test.mir`
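
To track a fix, the same invocation could be turned into a lit regression test by prepending a RUN line to `test.mir`. The following is only a sketch: dropping the debug and print flags and sending the output to `/dev/null` are assumptions, not part of the original report. No `FileCheck` lines are given because the expected MIR after a fix is not known here. The test simply fails for as long as `-verify-machineinstrs` reports the error.

```
# Hypothetical lit header for test.mir (assumed, not from the report).
# llc exits with an error while the machine verifier rejects the pass output.
# RUN: llc -mtriple=amdgcn -mcpu=gfx1150 -mattr=+real-true16 -run-pass=si-fix-sgpr-copies -verify-machineinstrs -o /dev/null %s
```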
## Error message from machine verifier
```
*** Bad machine code: Illegal virtual register for instruction ***
- function: sgpr_copy_invalid_type
- basic block: %bb.0 (0x557e9fcd29b8)
- instruction: %8:vgpr_16 = V_CNDMASK_B16_t16_e64 0, undef %7:sgpr_lo16, 0, 1, killed undef %2:sreg_32_xm0_xexec, 0, implicit $exec
- operand 2: undef %7:sgpr_lo16
Expected a VS_16 register, but got a SGPR_LO16 register
```
## Debug output from si-fix-sgpr-copies
```
V2S copy %3:sgpr_lo16 = COPY undef %0.lo16:vgpr_32
is being turned to v_readfirstlane_b32 Score: 3
*** IR Dump After SI Fix SGPR copies (si-fix-sgpr-copies) on sgpr_copy_invalid_type ***
# Machine code for function sgpr_copy_invalid_type: IsSSA, NoPHIs, TracksLiveness
```
```diff
bb.0:
; predecessors: %bb.1
%0:vgpr_32 = IMPLICIT_DEF
%1:sreg_32 = IMPLICIT_DEF
%2:sreg_32_xm0_xexec = IMPLICIT_DEF
- %3:sgpr_lo16 = COPY undef %0.lo16:vgpr_32
- %4:sreg_32 = COPY undef %3:sgpr_lo16
+ %10:vgpr_16 = IMPLICIT_DEF
+ %9:vgpr_32 = REG_SEQUENCE %0.lo16:vgpr_32, %subreg.lo16, %10:vgpr_16, %subreg.hi16
+ %3:sreg_32_xm0 = V_READFIRSTLANE_B32 %9:vgpr_32, implicit $exec
+ %4:sreg_32 = COPY undef %3:sreg_32_xm0
%5:sreg_32 = S_AND_B32 undef %4:sreg_32, killed undef %1:sreg_32, implicit-def dead $scc
%6:sgpr_32 = S_CVT_F32_F16 killed undef %5:sreg_32, implicit $mode
- %7:vgpr_16 = COPY undef %3:sgpr_lo16
- %8:vgpr_16 = V_CNDMASK_B16_t16_e64 0, undef %7:vgpr_16, 0, 1, killed undef %2:sreg_32_xm0_xexec, 0, implicit $exec
+ %7:sgpr_lo16 = COPY undef %3:sreg_32_xm0
+ %8:vgpr_16 = V_CNDMASK_B16_t16_e64 0, undef %7:sgpr_lo16, 0, 1, killed undef %2:sreg_32_xm0_xexec, 0, implicit $exec
S_ENDPGM 0
bb.1:
successors: %bb.0(0x80000000); %bb.0(100.00%)
S_CMP_LG_U32 killed undef %4:sreg_32, killed undef %4:sreg_32, implicit-def $scc
S_CBRANCH_SCC1 %bb.0, implicit undef $scc
S_ENDPGM 0
# End machine code for function sgpr_copy_invalid_type.
```
## Related commit
[[AMDGPU][True16][CodeGen] readfirstlane for vgpr16 copy to sgpr32](https://github.com/llvm/llvm-project/commit/d4706e17f55d058316d1cc3ce86bee14ad11f5d6) by @broxigarchen