Segher was kind enough to give me an offline review on his vacation. I made some small changes and committed the following. Thanks!
Bill

2018-01-02  Bill Schmidt  <wschm...@linux.vnet.ibm.com>

	* config/rs6000/rs6000-p8swap.c (swap_feeds_both_load_and_store):
	New function.
	(rs6000_analyze_swaps): Mark a web unoptimizable if it contains a
	swap associated with both a load and a store.

Index: gcc/config/rs6000/rs6000-p8swap.c
===================================================================
--- gcc/config/rs6000/rs6000-p8swap.c	(revision 256110)
+++ gcc/config/rs6000/rs6000-p8swap.c	(working copy)
@@ -328,6 +328,38 @@ insn_is_swap_p (rtx insn)
   return 1;
 }
 
+/* Return 1 iff INSN_ENTRY, known to reference a swap, is both fed by
+   a load and a feeder of a store.  */
+static unsigned int
+swap_feeds_both_load_and_store (swap_web_entry *insn_entry)
+{
+  rtx insn = insn_entry->insn;
+  struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
+  df_ref def, use;
+  struct df_link *link = 0;
+  rtx_insn *load = 0, *store = 0;
+  bool fed_by_load = 0;
+  bool feeds_store = 0;
+
+  FOR_EACH_INSN_INFO_USE (use, insn_info)
+    {
+      link = DF_REF_CHAIN (use);
+      load = DF_REF_INSN (link->ref);
+      if (insn_is_load_p (load) && insn_is_swap_p (load))
+	fed_by_load = 1;
+    }
+
+  FOR_EACH_INSN_INFO_DEF (def, insn_info)
+    {
+      link = DF_REF_CHAIN (def);
+      store = DF_REF_INSN (link->ref);
+      if (insn_is_store_p (store) && insn_is_swap_p (store))
+	feeds_store = 1;
+    }
+
+  return fed_by_load && feeds_store;
+}
+
 /* Return TRUE if insn is a swap fed by a load from the constant pool.  */
 static bool
 const_load_sequence_p (swap_web_entry *insn_entry, rtx insn)
@@ -2030,6 +2062,14 @@ rs6000_analyze_swaps (function *fun)
	       && !insn_entry[i].is_swap && !insn_entry[i].is_swappable)
	root->web_not_optimizable = 1;
 
+      /* If we have a swap that is both fed by a permuting load
+	 and a feeder of a permuting store, then the optimization
+	 isn't appropriate.  (Consider vec_xl followed by vec_xst_be.)  */
+      else if (insn_entry[i].is_swap && !insn_entry[i].is_load
+	       && !insn_entry[i].is_store
+	       && swap_feeds_both_load_and_store (&insn_entry[i]))
+	root->web_not_optimizable = 1;
+
       /* If we have permuting loads or stores that are not accompanied
	  by a register swap, the optimization isn't appropriate.  */
       else if (insn_entry[i].is_load && insn_entry[i].is_swap)

On 12/19/17 7:59 AM, Bill Schmidt wrote:
> Hi,
>
> Carl Love is working on a patch to add missing flavors of the
> vec_xst_be intrinsic and test cases to cover all flavors.  He ran
> into a latent bug in swap optimization that this patch addresses.
> Swap optimization operates on the principle that a computation can
> have swaps removed if all permuting loads are accompanied by a swap,
> all permuting stores are accompanied by a swap, and the remaining
> vector computations are lane-insensitive or easy to adjust if lanes
> are swapped across doublewords.
>
> A new problem that arises with vec_xl_be and vec_xst_be is that the
> same swap may accompany both a load and a store, so that removing
> that swap changes the semantics of the program.  Suppose we have a
> vec_xl from *(a+b) followed by a vec_xst_be to *(c+d).  The code at
> expand time then looks like:
>
>   lxvd2x  x,a,b
>   xxswapd x,x
>   stxvd2x x,c,d
>
> The first two instructions are generated by vec_xl, while the last
> is generated by vec_xst_be.  Swap optimization removes the xxswapd
> because this sequence satisfies the rules, but now we have the same
> result as if the vec_xst_be were actually a vec_xst.
>
> To avoid this, this patch marks a computation as unoptimizable if
> it contains a swap that is both fed by a permuting load and feeds
> into a permuting store.
>
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu for POWER8
> with no regressions.  Carl has verified this fixes the related
> problems in his test cases under development.  Is this okay for
> trunk?
>
> Thanks,
> Bill
>