On Wed, 12 Aug 2020, Roger Sayle wrote:
This patch is inspired by a small code fragment in comment #3 of
bugzilla PR rtl-optimization/94804. That snippet appears almost
unrelated to the topic of the PR, but recognizing __builtin_bswap64
from two __builtin_bswap32 calls seems like a clever/useful trick.
GCC's optabs.c contains the inverse logic to expand bswap64 by
IORing two bswap32 calls, so this transformation/canonicalization
is safe, even on targets without suitable optab support. But
on x86_64, the swap64 of the test case becomes a single instruction.
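For illustration, the identity behind this transformation can be sketched in C as below. This is a minimal sketch, not the code from the PR or the patch: the function name swap64_via_bswap32 is hypothetical, and fixed-width types from <stdint.h> are assumed. The result should agree with __builtin_bswap64 for every input.

```c
#include <stdint.h>

/* Hypothetical helper (not from the PR): byte-swap a 64-bit value by
   swapping each 32-bit half and exchanging the two halves.  */
static uint64_t swap64_via_bswap32(uint64_t x)
{
    uint32_t lo = (uint32_t)x;          /* low 32 bits */
    uint32_t hi = (uint32_t)(x >> 32);  /* high 32 bits */

    /* The swapped low half becomes the new high half, and vice versa.  */
    return ((uint64_t)__builtin_bswap32(lo) << 32)
           | (uint64_t)__builtin_bswap32(hi);
}
```

The shift-and-IOR form above is the shape a recognizer would have to match in order to replace it with a single __builtin_bswap64 call.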
This patch has been tested on x86_64-pc-linux-gnu with a "make
bootstrap" and a "make -k check" with no new failures.
Ok for mainline?
Your tests seem to assume that int has 32 bits and long 64.
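One possible way to avoid that assumption, sketched here rather than taken from the patch, is to build the test on GCC's predefined __UINT32_TYPE__ and __UINT64_TYPE__ macros instead of int and long. The typedef names u32/u64 and the body of swap64 below are assumptions made for illustration only.

```c
/* GCC predefines these macros with the target's exact-width types, so
   the test does not depend on the widths of int and long.  */
typedef __UINT32_TYPE__ u32;
typedef __UINT64_TYPE__ u64;

/* Fail at compile time if the widths are not what the test expects.  */
_Static_assert(sizeof(u32) * __CHAR_BIT__ == 32, "u32 must be 32 bits");
_Static_assert(sizeof(u64) * __CHAR_BIT__ == 64, "u64 must be 64 bits");

/* Illustrative test function: a 64-bit swap built from two 32-bit swaps.  */
u64 swap64(u64 in)
{
    u32 lo = (u32)in;
    u32 hi = (u32)(in >> 32);
    return ((u64)__builtin_bswap32(lo) << 32) | (u64)__builtin_bswap32(hi);
}
```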
+ (if (operand_equal_p (@0, @2, 0)
Why not reuse @0 instead of introducing @2 in the pattern? Similarly, it
may be a bit shorter to reuse @1 instead of a new @3 (I don't think the
tricks with @@ will be needed here).
+ && types_match (TREE_TYPE (@0), uint64_type_node)
that seems very specific. What goes wrong with a signed type for instance?
+(simplify
+ (bit_ior:c
+ (lshift
+ (convert (BUILT_IN_BSWAP16 (convert (bit_and @0
+ INTEGER_CST@1))))
+ (INTEGER_CST@2))
+ (convert (BUILT_IN_BSWAP16 (convert (rshift @3
+ INTEGER_CST@4)))))
I didn't realize we kept this useless bit_and when casting to a smaller
type. We probably get a different pattern on 16-bit targets, but a pattern
they do not match won't hurt them.
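For reference, source code of roughly the following shape would produce a tree like the one in the pattern. This is only a sketch: the patch leaves the constants as INTEGER_CSTs, so the mask 0xffff and the shift counts of 16 are assumptions, and the function name swap32_via_bswap16 is hypothetical.

```c
#include <stdint.h>

/* Hypothetical example: a 32-bit byte swap built from two 16-bit swaps.  */
static uint32_t swap32_via_bswap16(uint32_t x)
{
    /* The mask is redundant, since the conversion to uint16_t already
       discards the upper bits; it corresponds to the bit_and in the
       pattern.  */
    uint16_t lo = (uint16_t)(x & 0xffff);
    uint16_t hi = (uint16_t)(x >> 16);

    /* The swapped low half moves to the top, the swapped high half to
       the bottom.  */
    return ((uint32_t)__builtin_bswap16(lo) << 16)
           | (uint32_t)__builtin_bswap16(hi);
}
```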
--
Marc Glisse