On 08/29/2017 05:36 PM, Michael Clark wrote:
> We’re investigating an issue with redundant sign-extension instructions being 
> emitted with the riscv backend. Firstly I would like to state that riscv is 
> possibly a unique backend with respect to its canonical sign-extended 
> register form due to the following:
> 
> - POINTERS_EXTEND_UNSIGNED is false, and the canonical form after 32-bit 
> operations on RV64 is sign-extended not zero-extended.
> 
> - PROMOTE_MODE is defined on RV64 such that SI mode values are promoted to 
> signed DI mode values holding SI mode subregs
> 
> - RISC-V does not have register aliases for these different modes, rather 
> different instructions are selected to operate on different register widths.

This is identical to Alpha.

> - The 32-bit instructions sign-extend results. e.g. all shifts, add, sub, etc.
> 
> - Some instructions such as logical ops only have full word width variants 
> (AND, OR, XOR) but these instructions maintain canonical form as there is no 
> horizontal movement of bits.

Alpha only had 32-bit instructions for for add/sub/mul that also sign-extend,
so there is a little less scope for optimization.

> testcase 1:
> 
> $ cat bswap.c
> unsigned bswap(unsigned p) {
>       return ((p >> 24) & 0x000000ff) | ((p << 8 ) & 0x00ff0000) | 
>        ((p >> 8 ) & 0x0000ff00) | ((p << 24) & 0xff000000);
> }
> $ cat bswap.s
> bswap2:
>       sllw    a3,a0,24
>       srlw    a5,a0,24
>       sllw    a4,a0,8
>       or      a5,a5,a3
>       li      a3,16711680
>       and     a4,a4,a3
>       or      a5,a5,a4
>       li      a4,65536
>       add     a4,a4,-256
>       srlw    a0,a0,8
>       and     a0,a0,a4
>       or      a0,a5,a0
>       sext.w  a0,a0   # redundant
>       ret

I think the easiest solution to this is for combine to notice when IOR has
operands with non-zero-bits that do not overlap, convert the operation to ADD.
That allows the final two insns to fold to "addw" and the compiler need do no
further analysis.

That may even allow better code generation in other ways, such as a
three-operand LEA insn on two-operand architectures (x86, s390x).  Or for a
slightly different test case, perhaps further folding to a shift-add insn.

> $ cat rshift.c
> unsigned rs24(unsigned rs1) { return rs1 >> 24; }
> $ cat rshift.s
> rs24:
>       sllw    a0,a0,24
>       sext.w  a0,a0   # redundant
>       ret

That seems like a missing check somewhere (combine? simplify-rtx? both?) for
SUBREG_PROMOTED_SIGNED_P.  Since Alpha didn't have a 32-bit shift you're in new
territory for this one.


r~

Reply via email to