https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81174
--- Comment #3 from Thomas Preud'homme <thopre01 at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #2)
> Simplified testcase:
> static inline unsigned
> bar (unsigned x)
> {
>   return ((x & 0x000000ff) << 24) | ((x & 0x0000ff00) << 8)
>          | ((x & 0x00ff0000) >> 8) | ((x & 0xff000000) >> 24);
> }
>
> unsigned
> foo (unsigned p, unsigned q)
> {
>   p &= ~0x1ffff80U;
>   p |= bar ((q << 7) & 0x1ffff80U);
>   return p;
> }
>
> The problem is that with |= instead of += the reassoc pass is invoked first
> and reassociates the many | operands and then the bswap pass can't recognize
> the pattern.
> So, either we should consider moving the bswap pass 6 passes earlier (i.e.
> before reassoc1), or if we want to catch that we'd need to be able to
> recognize | operands unrelated to the bswap pattern mixed with | operands
> related to that, and replace just the ones related to the bswap and leave
> the others in.

Two things need to happen for bswap to catch the byteswap in this testcase:

1) do some reassociation to isolate the expressions involving a single source
2) look for a byteswap by stopping at every level of the recursion.

Let me illustrate with the testcase reduced by Jakub. Here's the gimple I get
with this testcase before bswap:

foo (unsigned int p, unsigned int q)
{
  unsigned int _1;
  unsigned int _2;
  unsigned int _7;
  unsigned int _9;
  unsigned int _10;
  unsigned int _12;
  unsigned int _13;
  unsigned int _15;
  unsigned int _17;
  unsigned int _18;
  unsigned int _19;

  <bb 2> [100.00%]:
  p_4 = p_3(D) & 0xFE00007F;
  _1 = q_5(D) << 7;
  _2 = _1 & 0x01FFFF80;
  _7 = _2 << 24;
  _9 = _2 << 8;
  _10 = _9 & 0x00FF0000;
  _12 = _2 >> 8;
  _13 = _12 & 0x0000FF00;
  _15 = _2 >> 24;
  _17 = _7 | _15;
  _18 = p_4 | _17;
  _19 = _10 | _18;
  p_8 = _13 | _19;
  return p_8;
}

1) is needed to ignore expressions based on p when looking at the statement
defining p_8. Currently, because p_8 ORs _13 (based on q) and _19 (based on p
and q), the match would fail. Similarly for _19 and _18.
Then when looking at _17, some statements would be missing to get a byteswap.
So it would be necessary to start at p_8 but realize that | is associative and
thus reassociate by only following expressions based on q. bswap needs to be
extended so that the analysis for one statement can return several results
(here one for the expression based on p and one for the expression based on q)
as well as the operation that links these results (OR).

2) is needed to stop once the statement defining _2 is reached. Currently bswap
would continue to recurse through _1, but due to the bitwise AND the pattern
would not be a byteswap of q and the match would fail. Some form of memoization
would be needed to keep this from becoming expensive.
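
To make the goal concrete, here is a small C sketch that is not part of the bug report. It mirrors the OR chain from the gimple dump above and compares it with the form the two points describe: the q-based operands are grouped, and the recursion stops at _2 so that the group becomes a single byteswap. The names foo_gimple_order and foo_regrouped are made up for illustration, uint32_t stands in for a 32-bit unsigned, and __builtin_bswap32 is the GCC builtin. The sketch only shows that the two forms compute the same value; it says nothing about how the pass would find the grouping.

```c
#include <stdint.h>

/* Byteswap written with shifts and masks, as in the reduced testcase. */
static inline uint32_t
bar (uint32_t x)
{
  return ((x & 0x000000ffU) << 24) | ((x & 0x0000ff00U) << 8)
         | ((x & 0x00ff0000U) >> 8) | ((x & 0xff000000U) >> 24);
}

/* The reduced testcase itself. */
uint32_t
foo (uint32_t p, uint32_t q)
{
  p &= ~0x1ffff80U;
  p |= bar ((q << 7) & 0x1ffff80U);
  return p;
}

/* Same statements and same OR order as the gimple dump: p_4 sits in the
   middle of the chain, between the q-based operands.  */
uint32_t
foo_gimple_order (uint32_t p, uint32_t q)
{
  uint32_t p4 = p & 0xFE00007FU;
  uint32_t t1 = q << 7;
  uint32_t t2 = t1 & 0x01FFFF80U;
  uint32_t t7 = t2 << 24;
  uint32_t t10 = (t2 << 8) & 0x00FF0000U;
  uint32_t t13 = (t2 >> 8) & 0x0000FF00U;
  uint32_t t15 = t2 >> 24;
  uint32_t t17 = t7 | t15;
  uint32_t t18 = p4 | t17;
  uint32_t t19 = t10 | t18;
  return t13 | t19;
}

/* Hypothetical result after 1) and 2): the q-based operands
   (_7, _10, _13, _15) are grouped into a byteswap of _2, and p_4 is
   ORed in separately.  */
uint32_t
foo_regrouped (uint32_t p, uint32_t q)
{
  uint32_t p4 = p & 0xFE00007FU;
  uint32_t t2 = (q << 7) & 0x01FFFF80U;
  return p4 | __builtin_bswap32 (t2);
}
```

Note that the byteswap is of _2, not of q or _1: following the chain past the AND with 0x01FFFF80 no longer gives a pure byte permutation of the source, which is why the analysis has to be willing to stop at that level.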