> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: Wednesday, June 11, 2014 4:09 PM
> > Oh I see. Doing it there would mean instead of two independent
> > operations you'd do the best combination possible, is that right?
>
> Yes (but probably it's not worth the trouble).

I understood that.

> > I'm tempted to use a simple heuristic such as comparing the
> > number of loads before and after, adding one if the load is
> > unaligned. So in the above example, supposing that there is
> > some computation done around x[0] before the return line,
> > we'd have 2 loads before Vs 2 x is unaligned and we would
> > cancel the optimization. If x is aligned the optimization would
> > proceed.
> >
> > Do you thing this approach is also too much trouble or would
> > not work?
>
> I'm not sure. For noop-loads I'd keep them unconditionally, even if
> unaligned. I'd disable unaligned-load + bswap for now. People
> interested and sitting on such a target should do the measurements
> and decide if it's worth the trouble (is arm affected?).

Yes it is.

> But I see that the code currently does not limit itself to single-use
> chains and thus may end up keeping the whole original code life
> by unrelated uses. So a good thing would be to impose proper
> restrictions here. For example, in find_bswap_or_nop_1 do
>
>   if (TREE_CODE (rhs1) != SSA_NAME
>       || !has_single_use (rhs1))

But then the example in gcc.dg/optimize-bswapdi-2.c would not work,
for instance. The same goes for swap32_b in gcc.dg/optimize-bswapsi-1.c.

To make it work you'd need to check that there is no use outside the
set of statements that form the bitwise OR operation you are
considering.

Best regards,

Thomas
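
The "no use outside the set" check mentioned above can be illustrated with a minimal sketch. It works on a toy model, not on GCC's gimple/SSA data structures: `struct toy_stmt`, `in_set` and `uses_confined_to_set` are hypothetical names invented for this illustration, and values are plain integer ids. The idea is that a value defined inside the candidate set may have several uses, as long as all of them are in statements belonging to the set (the final result is exempt, since it is meant to be used afterwards).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an SSA statement (an assumption for illustration,
   not GCC's representation): it defines one value id and uses up to
   two value ids; an unused slot holds -1.  */
struct toy_stmt {
  int def;
  int use[2];
};

/* Return true if statement index IDX is a member of SET.  */
static bool in_set(const size_t *set, size_t n_set, size_t idx)
{
  for (size_t i = 0; i < n_set; i++)
    if (set[i] == idx)
      return true;
  return false;
}

/* Return true if no value defined by a statement in SET, other than
   the final value RESULT, is used by a statement outside SET.  */
bool uses_confined_to_set(const struct toy_stmt *stmts, size_t n_stmts,
                          const size_t *set, size_t n_set, int result)
{
  for (size_t i = 0; i < n_set; i++) {
    assert(set[i] < n_stmts);
    int def = stmts[set[i]].def;

    /* The result of the whole OR expression is expected to escape.  */
    if (def == result)
      continue;

    /* Scan every statement outside the set for a use of DEF.  */
    for (size_t j = 0; j < n_stmts; j++) {
      if (in_set(set, n_set, j))
        continue;
      if (stmts[j].use[0] == def || stmts[j].use[1] == def)
        return false;
    }
  }
  return true;
}
```

For instance, with t1 = x << 8, t2 = x >> 8 and r = t1 | t2 as the set, the check passes even though x is used twice; adding an unrelated statement y = t1 + 1 outside the set makes it fail. This is only meant to show the shape of the condition; a real implementation would presumably walk the immediate uses of each SSA name rather than scan all statements.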