On Sat, 15 Aug 2020, Roger Sayle wrote:

Here's version #2 of the patch to recognize bswap32 and bswap64
incorporating your suggestions and feedback. The test cases now confirm
the transformation is applied when int is 32 bits and long is 64 bits,
and should pass otherwise; the patterns now reuse (more) capturing
groups, and the patterns have been made more generic to allow the
ultimate type to be signed or unsigned (hence there are now two new
gcc.dg tests).

Alas my efforts to allow the input argument to be signed, and use
fold_convert to coerce it to the correct type before calling
__builtin_bswap failed, with the error messages:

You can't use fold_convert for that (well, maybe if you restricted the
transformation to GENERIC), but if I understand correctly, you are trying
to do

(convert (BUILT_IN_BSWAP64 (convert:uint64_type_node @1)))

? (untested)

From: Marc Glisse <marc.gli...@inria.fr>

+(simplify
+ (bit_ior:c
+  (lshift
+   (convert (BUILT_IN_BSWAP16 (convert (bit_and @0 INTEGER_CST@1))))
+   (INTEGER_CST@2))
+  (convert (BUILT_IN_BSWAP16 (convert (rshift @3 INTEGER_CST@4)))))

I didn't realize we kept this useless bit_and when casting to a smaller
type.

I was confused when I wrote that and thought we were converting from int
to uint16_t, but bswap16 actually takes an int on x86_64, probably because
of the calling convention, so we are converting from unsigned int to int.
Having implementation details like the calling convention appear here in
the intermediate language complicates things a bit. Can we assume that it
is fine to build a call to bswap32/bswap64 taking uint32_t/uint64_t and
that only bswap16 can be affected? Do most targets have a similar-enough
calling convention that this transformation also works on them? It looks
as though aarch64, powerpc64le and mips64el would want, for
bswap16->bswap32, a transformation of the same form as the one you wrote
for bswap32->bswap64.
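
As a sanity check on the form of these patterns, the identities behind
them can be exercised in plain C. This is only a sketch: the helper names
(bswap32_from_halves, bswap64_from_halves, check_halves) are made up for
illustration and do not come from the patch, and it assumes a compiler
that provides GCC's __builtin_bswap16/32/64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a 32-bit byte swap assembled from two 16-bit swaps.
   The swapped low half moves to the top, the swapped high half to the
   bottom, which is the shape the bswap16 pattern looks for.  */
static uint32_t
bswap32_from_halves (uint32_t x)
{
  uint32_t lo = __builtin_bswap16 ((uint16_t) (x & 0xffff));
  uint32_t hi = __builtin_bswap16 ((uint16_t) (x >> 16));
  return (lo << 16) | hi;
}

/* Hypothetical helper: the same construction one level up, a 64-bit byte
   swap assembled from two 32-bit swaps.  */
static uint64_t
bswap64_from_halves (uint64_t x)
{
  uint64_t lo = __builtin_bswap32 ((uint32_t) x);
  uint64_t hi = __builtin_bswap32 ((uint32_t) (x >> 32));
  return (lo << 32) | hi;
}

/* Compare both constructions against the builtins on a few sample values.  */
static void
check_halves (void)
{
  static const uint64_t samples[] = {
    0, 1, 0x11223344ULL, 0x80000000ULL, 0xffffffffULL,
    0x1122334455667788ULL, 0xfedcba9876543210ULL
  };
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      uint64_t s = samples[i];
      assert (bswap32_from_halves ((uint32_t) s)
              == __builtin_bswap32 ((uint32_t) s));
      assert (bswap64_from_halves (s) == __builtin_bswap64 (s));
    }
}
```

Calling check_halves () from a test driver only confirms the arithmetic
identity on unsigned operands; it says nothing about how the bswap16
argument is promoted in GIMPLE on a given target.
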
I was wondering what would happen if I start from an int instead of an
unsigned int.

f (int x)
{
  short unsigned int _1;
  short unsigned int _2;
  short unsigned int _3;
  int _5;
  int _7;
  unsigned int _8;
  unsigned int _9;
  int _10;

  <bb 2> [local count: 1073741824]:
  _7 = x_4(D) & 65535;
  _1 = __builtin_bswap16 (_7);
  _8 = (unsigned int) x_4(D);
  _9 = _8 >> 16;
  _10 = (int) _9;
  _2 = __builtin_bswap16 (_10);
  _3 = _1 | _2;
  _5 = (int) _3;
  return _5;
}

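
To make the dump easier to follow, here is a statement-by-statement C
rendering of it. It is a hypothetical transliteration (the name
f_from_dump and the temporaries t1..t10 are mine), not the source that
produced the dump, and it assumes GCC's __builtin_bswap16. The point of
interest is that the high half is extracted with a shift on the unsigned
copy of x, so the sign bits of a negative input never reach bswap16.

```c
#include <stdint.h>

/* Hypothetical C rendering of the GIMPLE dump; each temporary tN mirrors
   the SSA name _N.  */
static int
f_from_dump (int x)
{
  int t7 = x & 65535;                            /* _7 = x & 65535        */
  unsigned short t1 = __builtin_bswap16 ((uint16_t) t7);
  unsigned int t8 = (unsigned int) x;            /* _8 = (unsigned int) x */
  unsigned int t9 = t8 >> 16;                    /* logical shift         */
  int t10 = (int) t9;                            /* always in 0..65535    */
  unsigned short t2 = __builtin_bswap16 ((uint16_t) t10);
  unsigned short t3 = t1 | t2;                   /* _3 = _1 | _2          */
  int t5 = (int) t3;
  return t5;
}
```

For instance, f_from_dump (-1) yields 65535: both halves are 0xffff before
and after the swap, and nothing is sign-extended on the way out.
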
Handling this in the same transformation with a pair of optional
conversions (convert1? and convert2?) and some tests should be doable, but
it gets complicated enough that it is fine to postpone that.

--
Marc Glisse