On 4/25/23 14:20, Roger Sayle wrote:

This patch adds support for xstormy16's swpb (swap bytes) and swpw (swap
words) instructions.  The most obvious application of these is to implement
the __builtin_bswap16 and __builtin_bswap32 intrinsics.

Currently, __builtin_bswap16 is implemented as:
foo:    mov r7,r2
         shl r7,#8
         shr r2,#8
         or r2,r7
         ret

but with this patch becomes:
foo:    swpb r2
         ret
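
For reference, a minimal C sketch of the two source forms involved. The
function names are hypothetical and are not taken from the new test cases.
The first function spells out the same shift/or computation as the old
five-instruction sequence; the second uses the builtin, which with the patch
is expected to become a single swpb.

```c
#include <stdint.h>

/* Open-coded byte swap: the same computation as the old
   mov/shl/shr/or sequence.  (Hypothetical name.)  */
uint16_t bswap16_shifts(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Builtin form: the case the new bswaphi2 pattern is meant to cover.
   (Hypothetical name.)  */
uint16_t bswap16_builtin(uint16_t x)
{
    return __builtin_bswap16(x);
}
```

Both functions return the same value for every 16-bit input; only the
generated code differs.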

Likewise, __builtin_bswap32 now becomes:
foo:    swpb r2 | swpb r3 | swpw r2,r3
         ret
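
The three-instruction sequence can be read as "swap the bytes within each
16-bit half, then exchange the halves".  A rough C model of that
decomposition follows; the helper names are hypothetical, and the mapping of
r2 to the low word and r3 to the high word is an assumption made for
illustration only.

```c
#include <stdint.h>

/* Models swpb: swap the two bytes of one 16-bit word.  */
static uint16_t swap_bytes16(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Models "swpb r2 | swpb r3 | swpw r2,r3" on a 32-bit value held in
   two 16-bit words.  Which register holds which half is assumed.  */
uint32_t bswap32_by_halves(uint32_t x)
{
    uint16_t lo = (uint16_t)(x & 0xFFFFu);  /* assumed to be r2 */
    uint16_t hi = (uint16_t)(x >> 16);      /* assumed to be r3 */

    lo = swap_bytes16(lo);                  /* swpb r2 */
    hi = swap_bytes16(hi);                  /* swpb r3 */

    /* swpw r2,r3: the old low word becomes the new high word.  */
    return ((uint32_t)lo << 16) | hi;
}
```

The result agrees with __builtin_bswap32 for any input, e.g. 0x12345678
maps to 0x78563412.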

Finally, the swpw instruction on its own can be used to exchange
two word mode registers without a temporary, so a new pattern and
peephole2 have been added to catch this.  As described in the
PR rtl-optimization/106518, register allocation can (in theory)
be more efficient on targets that provide a swap/exchange instruction.
The slightly unusual swap<mode> naming matches that used in i386.md.
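
As an illustration of where an exchange can arise, here is a small C sketch
in which two word-sized arguments have to trade places before a call.  The
function names are hypothetical, and whether GCC actually emits swpw for
this exact input is an assumption, not something confirmed by the test
cases.

```c
/* Hypothetical callee; noinline keeps the call (and the argument
   shuffle in the caller) from being optimized away.  */
__attribute__((noinline))
unsigned short sub_words(unsigned short a, unsigned short b)
{
    return (unsigned short)(a - b);
}

/* The caller receives (a, b) but must pass (b, a), so the two
   argument registers need to be exchanged.  Without a swap
   instruction this takes three moves through a temporary.  */
unsigned short rsub_words(unsigned short a, unsigned short b)
{
    return sub_words(b, a);
}
```

A tmp = a; a = b; b = tmp sequence of register moves is the kind of shape
the new peephole2 is described as recognizing.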

This patch has been tested by building a cross-compiler to xstormy16-elf
from x86_64-pc-linux-gnu, and confirming the new test cases pass.
Ok for mainline?


2023-04-25  Roger Sayle  <ro...@nextmovesoftware.com>

gcc/ChangeLog
        * config/stormy16/stormy16.md (bswaphi2): New define_insn.
        (bswapsi2): New define_insn.
        (swaphi): New define_insn to exchange two registers (swpw).
        (define_peephole2): Recognize exchange of registers as swaphi.

gcc/testsuite/ChangeLog
        * gcc.target/xstormy16/bswap16.c: New test case.
        * gcc.target/xstormy16/bswap32.c: Likewise.
        * gcc.target/xstormy16/swpb.c: Likewise.
        * gcc.target/xstormy16/swpw-1.c: Likewise.
        * gcc.target/xstormy16/swpw-2.c: Likewise.

OK.  And like prior patches, if it causes any problems in wider testing,
we'll know ~24hrs after the bits go in.

jeff
