Hi Richard, I realized thanks to Christophe Lyon that a shift was not right: the shift count is a number of bytes instead of a number of bits.
This extra patch fixes the problem. ChangeLog are as follows: *** gcc/ChangeLog *** 2014-09-26 Thomas Preud'homme <thomas.preudho...@arm.com> * tree-ssa-math-opts.c (find_bswap_or_nop_1): Fix creation of MARKER_BYTE_UNKNOWN markers when handling casts. *** gcc/testsuite/ChangeLog *** 2014-10-08 Thomas Preud'homme <thomas.preudho...@arm.com> * gcc.dg/optimize-bswaphi-1.c: New bswap pass test. diff --git a/gcc/testsuite/gcc.dg/optimize-bswaphi-1.c b/gcc/testsuite/gcc.dg/optimize-bswaphi-1.c index 3e51f04..18aba28 100644 --- a/gcc/testsuite/gcc.dg/optimize-bswaphi-1.c +++ b/gcc/testsuite/gcc.dg/optimize-bswaphi-1.c @@ -42,6 +42,20 @@ uint32_t read_be16_3 (unsigned char *data) return *(data + 1) | (*data << 8); } +typedef int SItype __attribute__ ((mode (SI))); +typedef int HItype __attribute__ ((mode (HI))); + +/* Test that detection of significant sign extension works correctly. This + checks that unknown byte marker are set correctly in cast of cast. */ + +HItype +swap16 (HItype in) +{ + return (HItype) (((in >> 0) & 0xFF) << 8) + | (((in >> 8) & 0xFF) << 0); +} + /* { dg-final { scan-tree-dump-times "16 bit load in target endianness found at" 3 "bswap" } } */ -/* { dg-final { scan-tree-dump-times "16 bit bswap implementation found at" 3 "bswap" { xfail alpha*-*-* arm*-*-* } } } */ +/* { dg-final { scan-tree-dump-times "16 bit bswap implementation found at" 1 "bswap" { target alpha*-*-* arm*-*-* } } } */ +/* { dg-final { scan-tree-dump-times "16 bit bswap implementation found at" 4 "bswap" { xfail alpha*-*-* arm*-*-* } } } */ /* { dg-final { cleanup-tree-dump "bswap" } } */ diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c index 3c6e935..2ef2333 100644 --- a/gcc/tree-ssa-math-opts.c +++ b/gcc/tree-ssa-math-opts.c @@ -1916,7 +1916,8 @@ find_bswap_or_nop_1 (gimple stmt, struct symbolic_number *n, int limit) if (!TYPE_UNSIGNED (n->type) && type_size > old_type_size && HEAD_MARKER (n->n, old_type_size)) for (i = 0; i < type_size - old_type_size; i++) - n->n |= MARKER_BYTE_UNKNOWN << (type_size - 1 - i); + n->n |= MARKER_BYTE_UNKNOWN + << ((type_size - 1 - i) * BITS_PER_MARKER); if (type_size < 64 / BITS_PER_MARKER) { regression testsuite run without regression on x86_64-linux-gnu and bswap tests all pass on arm-none-eabi target Is it ok for trunk? Best regards, Thomas > -----Original Message----- > From: Richard Biener [mailto:richard.guent...@gmail.com] > Sent: Wednesday, September 24, 2014 4:01 PM > To: Thomas Preud'homme > Cc: GCC Patches > Subject: Re: [PATCH] Fix PR63266: Keep track of impact of sign extension > in bswap > > On Tue, Sep 16, 2014 at 12:24 PM, Thomas Preud'homme > <thomas.preudho...@arm.com> wrote: > > Hi all, > > > > The fix for PR61306 disabled bswap when a sign extension is detected. > However this led to a test case regression (and potential performance > regression) in case where a sign extension happens but its effect is > canceled by other bit manipulation. This patch aims to fix that by having a > special marker to track bytes whose value is unpredictable due to sign > extension. If the final result of a bit manipulation doesn't contain any > such marker then the bswap optimization can proceed. > > Nice and simple idea. > > Ok. > > Thanks, > Richard. > > > *** gcc/ChangeLog *** > > > > 2014-09-15 Thomas Preud'homme <thomas.preudho...@arm.com> > > > > PR tree-optimization/63266 > > * tree-ssa-math-opts.c (struct symbolic_number): Add comment > about > > marker for unknown byte value. > > (MARKER_MASK): New macro. > > (MARKER_BYTE_UNKNOWN): New macro. > > (HEAD_MARKER): New macro. > > (do_shift_rotate): Mark bytes with unknown values due to sign > > extension when doing an arithmetic right shift. Replace hardcoded > > mask for marker by new MARKER_MASK macro. > > (find_bswap_or_nop_1): Likewise and adjust ORing of two > symbolic > > numbers accordingly. > > > > *** gcc/testsuite/ChangeLog *** > > > > 2014-09-15 Thomas Preud'homme <thomas.preudho...@arm.com> > > > > PR tree-optimization/63266 > > * gcc.dg/optimize-bswapsi-1.c (swap32_d): New bswap pass test. > > > > > > Testing: > > > > * Built an arm-none-eabi-gcc cross-compiler and used it to run the > testsuite on QEMU emulating Cortex-M3 without any regression > > * Bootstrapped on x86_64-linux-gnu target and testsuite was run > without regression > > > > > > Ok for trunk?