On Wed, Jun 11, 2014 at 10:27 AM, Thomas Preud'homme
<thomas.preudho...@arm.com> wrote:
>> From: Richard Biener [mailto:richard.guent...@gmail.com]
>> Sent: Wednesday, June 11, 2014 4:09 PM
>
>> >
>> > Oh I see. Doing it there would mean instead of two independent
>> > operations you'd do the best combination possible, is that right?
>>
>> Yes (but probably it's not worth the trouble).
>
> I understood that.
>
>>
>> > I'm tempted to use a simple heuristic such as comparing the
>> > number of loads before and after, adding one if the load is
>> > unaligned. So in the above example, supposing that there is
>> > some computation done around x[0] before the return line,
>> > we'd have 2 loads before Vs 2 x is unaligned and we would
>> > cancel the optimization. If x is aligned the optimization would
>> > proceed.
>> >
>> > Do you think this approach is also too much trouble or would
>> > not work?
>>
>> I'm not sure.  For noop-loads I'd keep them unconditionally, even if
>> unaligned.  I'd disable unaligned-load + bswap for now.  People
>> interested and sitting on such a target should do the measurements
>> and decide if it's worth the trouble (is arm affected?).
>
> Yes it is.
>
>>
>> But I see that the code currently does not limit itself to single-use
>> chains and thus may end up keeping the whole original code live
>> by unrelated uses.  So a good thing would be to impose proper
>> restrictions here.  For example, in find_bswap_or_nop_1 do
>>
>>      if (TREE_CODE (rhs1) != SSA_NAME
>>          || !has_single_use (rhs1))
>
> But then the example in gcc.dg/optimize-bswapdi-2.c would not
> work for instance. Same for swap32_b in gcc.dg/optimize-bswapsi-1.c
>
> To make it work you'd need to check that there is no use outside the
> sets of statements that form the bitwise OR operation you are
> considering.

Yes, of course (also for generic shuffles that may have duplicate entries).

Richard.

> Best regards,
>
> Thomas
>
>
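
For illustration, the "no use outside the set of statements" check
discussed above could look roughly like the sketch below.  It is a toy
model only: struct toy_name, in_stmt_set and uses_confined_to_set are
hypothetical names, statements are plain integer ids, and none of this
is GCC's actual gimple/SSA API.  The idea is that an intermediate value
may have several uses, as long as every one of them is by a statement
that belongs to the bitwise OR expression being replaced.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an SSA name: only the ids of the statements that
   use it.  Hypothetical; this is NOT GCC's tree/gimple representation.  */
struct toy_name
{
  const int *use_stmts;  /* Ids of the statements using this name.  */
  size_t n_uses;         /* Number of entries in USE_STMTS.  */
};

/* Return true if statement id STMT is a member of SET.  */
static bool
in_stmt_set (int stmt, const int *set, size_t n_set)
{
  for (size_t i = 0; i < n_set; i++)
    if (set[i] == stmt)
      return true;
  return false;
}

/* Return true if every use of every name in NAMES is by a statement
   in SET, i.e. no intermediate value would be kept live by an
   unrelated use once the whole set is replaced.  */
static bool
uses_confined_to_set (const struct toy_name *names, size_t n_names,
                      const int *set, size_t n_set)
{
  for (size_t i = 0; i < n_names; i++)
    for (size_t j = 0; j < names[i].n_uses; j++)
      if (!in_stmt_set (names[i].use_stmts[j], set, n_set))
        return false;
  return true;
}
```

Compared with a plain has_single_use test, a name used twice inside
the set still passes here, while a name with one use inside and one
outside is rejected.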