On Wed, 12 Aug 2020, Roger Sayle wrote:

This patch is inspired by a small code fragment in comment #3 of
bugzilla PR rtl-optimization/94804.  That snippet appears almost
unrelated to the topic of the PR, but recognizing __builtin_bswap64
from two __builtin_bswap32 calls seems like a clever/useful trick.
GCC's optabs.c contains the inverse logic to expand bswap64 by
IORing two bswap32 calls, so this transformation/canonicalization
is safe, even on targets without suitable optab support.  But
on x86_64, the swap64 of the test case becomes a single instruction.
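
For readers without the PR at hand, the idiom being recognized looks roughly like the sketch below. This is not the snippet from the PR or the test case in the patch; the function name swap64_via_bswap32 is made up for illustration, and it assumes a compiler that provides the GCC builtins.

```c
#include <stdint.h>

/* Hypothetical illustration: byte-swap a 64-bit value by swapping each
   32-bit half and exchanging the halves.  */
uint64_t swap64_via_bswap32(uint64_t x)
{
    uint32_t lo = (uint32_t)x;          /* low 32 bits  */
    uint32_t hi = (uint32_t)(x >> 32);  /* high 32 bits */

    /* The swapped low half becomes the new high half, and vice versa.  */
    return ((uint64_t)__builtin_bswap32(lo) << 32)
           | (uint64_t)__builtin_bswap32(hi);
}
```

The result should agree with __builtin_bswap64 for every input, which is what makes the canonicalization possible.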


This patch has been tested on x86_64-pc-linux-gnu with a "make
bootstrap" and a "make -k check" with no new failures.
Ok for mainline?

Your tests seem to assume that int has 32 bits and long 64.

+  (if (operand_equal_p (@0, @2, 0)

Why not reuse @0 instead of introducing @2 in the pattern? Similarly, it may be a bit shorter to reuse @1 instead of a new @3 (I don't think the tricks with @@ will be needed here).

+       && types_match (TREE_TYPE (@0), uint64_type_node)

That seems very specific. What goes wrong with a signed type, for instance?
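
To make the question concrete, here is a small sketch with a signed 64-bit input. The name swap64_signed_input is hypothetical, and the code relies on GCC's arithmetic right shift for negative values (implementation-defined in ISO C). The truncation to 32 bits discards the sign-extended bits, so as far as the arithmetic goes the result appears to match __builtin_bswap64 on the same bit pattern.

```c
#include <stdint.h>

/* Hypothetical illustration: same idiom as the unsigned case, but the
   operand is signed.  */
uint64_t swap64_signed_input(int64_t x)
{
    uint32_t lo = (uint32_t)x;
    /* x >> 32 sign-extends under GCC, but only the low 32 bits of the
       shifted value survive the conversion to uint32_t.  */
    uint32_t hi = (uint32_t)(x >> 32);

    return ((uint64_t)__builtin_bswap32(lo) << 32)
           | (uint64_t)__builtin_bswap32(hi);
}
```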

+(simplify
+  (bit_ior:c
+    (lshift
+      (convert (BUILT_IN_BSWAP16 (convert (bit_and @0
+                                                  INTEGER_CST@1))))
+      (INTEGER_CST@2))
+    (convert (BUILT_IN_BSWAP16 (convert (rshift @3
+                                               INTEGER_CST@4)))))

I didn't realize we kept this useless bit_and when casting to a smaller type. We probably get a different pattern on 16-bit targets, but a pattern they do not match won't hurt them.
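
For reference, a source form that would plausibly give rise to this pattern is sketched below (the function name swap32_via_bswap16 is made up, and this is not taken from the patch's tests). The mask is redundant because the conversion to a 16-bit type already drops the upper bits.

```c
#include <stdint.h>

/* Hypothetical illustration: byte-swap a 32-bit value by swapping each
   16-bit half and exchanging the halves.  */
uint32_t swap32_via_bswap16(uint32_t x)
{
    /* The & 0xffff changes nothing: the cast to uint16_t truncates
       to the same 16 bits on its own.  */
    uint16_t lo = (uint16_t)(x & 0xffff);
    uint16_t hi = (uint16_t)(x >> 16);

    return ((uint32_t)__builtin_bswap16(lo) << 16)
           | (uint32_t)__builtin_bswap16(hi);
}
```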

--
Marc Glisse
