On Fri, Mar 30, 2012 at 5:43 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
> On Fri, Mar 30, 2012 at 4:15 PM, Richard Guenther
> <richard.guent...@gmail.com> wrote:
>> On Thu, Mar 29, 2012 at 5:25 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>>> On Thu, Mar 29, 2012 at 6:14 PM, Richard Guenther
>>> <richard.guent...@gmail.com> wrote:
>>>> On Thu, Mar 29, 2012 at 12:10 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>>>>> On Thu, Mar 29, 2012 at 6:07 PM, Richard Guenther
>>>>> <richard.guent...@gmail.com> wrote:
>>>>>> On Thu, Mar 29, 2012 at 12:02 PM, Bin.Cheng <amker.ch...@gmail.com>
>>>>>> wrote:
>>>>>>> Hi,
>>>>>>> Following is the tree dump of 094t.pre for a test program.
>>>>>>> Question is loads of D.5375_12/D.5375_14 are redundant on path
>>>>>>> <bb2, bb7, bb5, bb6>,
>>>>>>> but why not lowered into basic block 3, where it is used.
>>>>>>>
>>>>>>> BTW, seems no tree pass handles this case currently.
>>>>>>
>>>>>> tree-ssa-sink.c should do this.
>>>>>>
>>>>> It does not work for me, I will double check and update soon.
>>>>
>>>> Well, "should" as in, it's the place to do it.  And certainly the pass
>>>> can sink loads, so this must be a missed optimization.
>>>>
>>> Curiously, it is said explicitly that "We don't want to sink loads from
>>> memory." in tree-ssa-sink.c function statement_sink_location, and the
>>> condition is
>>>
>>>   if (stmt_ends_bb_p (stmt)
>>>       || gimple_has_side_effects (stmt)
>>>       || gimple_has_volatile_ops (stmt)
>>>       || (gimple_vuse (stmt) && !gimple_vdef (stmt))  <-----------------check load
>>>       || (cfun->has_local_explicit_reg_vars
>>>           && TYPE_MODE (TREE_TYPE (gimple_assign_lhs (stmt))) == BLKmode))
>>>     return false;
>>>
>>> I haven't found any clue about this decision in ChangeLogs.
>>
>> Ah, that's probably because usually you want to hoist loads and sink stores,
>> separating them (like a scheduler would do).  We'd want to restrict sinking
>> of loads to sink into not post-dominated regions (thus where they end up
>> being executed less times).
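
To make the quoted point concrete, here is a minimal C sketch (not taken from the thread or from GCC; the function names are made up for illustration). The load of *p is used on only one side of the branch, so the block that uses it does not post-dominate the block where the load originally sits. Moving the load into that block means it is executed only when its value is needed.

```c
/* Hypothetical example: the load is placed before the branch, so it is
   executed on every call even when cond is false and v is never used. */
int load_before_branch(const int *p, int cond)
{
    int v = *p;          /* load executed unconditionally */
    if (cond)
        return v + 1;    /* the only use of v */
    return 0;
}

/* Same function after sinking the load by hand into the block that uses
   it.  That block does not post-dominate the function entry, so the load
   now runs only when cond is true. */
int load_in_branch(const int *p, int cond)
{
    if (cond) {
        int v = *p;      /* load executed only on this path */
        return v + 1;
    }
    return 0;
}
```

Both functions return the same value for any input where p is valid; the difference is only how often the load is executed. Sinking into a block that does post-dominate the original location would execute the load just as often, which is presumably why the restriction is suggested.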

Hi Richard,

I am testing a patch that sinks loads from memory into the appropriate basic
block. Everything works except auto-vectorization: sinking a load sometimes
breaks the canonical form of the data references. I haven't touched the
auto-vectorizer before, so I cannot tell whether it is good or bad to do
sinking before auto-vectorization.

For example, in slp-cond-1.c:

<bb 3>:
  # i_39 = PHI <i_32(11), 0(2)>
  D.5150_5 = i_39 * 2;
  D.5151_10 = D.5150_5 + 1;
  D.5153_17 = a[D.5150_5];
  D.5154_19 = b[D.5150_5];
  if (D.5153_17 >= D.5154_19)
    goto <bb 9>;
  else
    goto <bb 4>;

<bb 9>:
  d0_6 = d[D.5150_5];     <-----this is sunk from bb3
  goto <bb 5>;

<bb 4>:
  e0_8 = e[D.5150_5];     <-----this is sunk from bb3

<bb 5>:
  # d0_2 = PHI <d0_6(9), e0_8(4)>
  k[D.5150_5] = d0_2;
  D.5159_26 = a[D.5151_10];
  D.5160_29 = b[D.5151_10];
  if (D.5159_26 >= D.5160_29)
    goto <bb 10>;
  else
    goto <bb 6>;

<bb 10>:
  d1_11 = d[D.5151_10];   <-----this is sunk from bb3
  goto <bb 7>;

<bb 6>:
  e1_14 = e[D.5151_10];   <-----this is sunk from bb3

<bb 7>:
  .......

I will look into the auto-vectorizer, but I am not sure how to handle this
case. Any comments?

Thanks very much.

--
Best Regards.
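
For readers following the dump, the loop can be read back as C roughly as below. This is a sketch reconstructed from the GIMPLE above, not the actual source of slp-cond-1.c; the function names, the array size and the test data are assumptions. The first function keeps all four loads of d[] and e[] together ahead of the compares, which is the shape the loads had in bb 3 before sinking. The second has each load moved into the branch that uses it, mirroring bb 9/bb 4 and bb 10/bb 6.

```c
#include <stddef.h>

/* Loads of d[] and e[] grouped before the compares (the pre-sinking shape). */
void select_grouped(const int *a, const int *b, const int *d, const int *e,
                    int *k, size_t pairs)
{
    for (size_t i = 0; i < pairs; i++) {
        int d0 = d[2 * i],     e0 = e[2 * i];
        int d1 = d[2 * i + 1], e1 = e[2 * i + 1];
        k[2 * i]     = a[2 * i]     >= b[2 * i]     ? d0 : e0;
        k[2 * i + 1] = a[2 * i + 1] >= b[2 * i + 1] ? d1 : e1;
    }
}

/* Each load sunk into the branch that uses it, as in the dump. */
void select_sunk(const int *a, const int *b, const int *d, const int *e,
                 int *k, size_t pairs)
{
    for (size_t i = 0; i < pairs; i++) {
        int d0, d1;
        if (a[2 * i] >= b[2 * i])
            d0 = d[2 * i];          /* corresponds to bb 9 */
        else
            d0 = e[2 * i];          /* corresponds to bb 4 */
        k[2 * i] = d0;
        if (a[2 * i + 1] >= b[2 * i + 1])
            d1 = d[2 * i + 1];      /* corresponds to bb 10 */
        else
            d1 = e[2 * i + 1];      /* corresponds to bb 6 */
        k[2 * i + 1] = d1;
    }
}

/* Runs both variants on the same made-up data; returns 1 if the results
   match element by element, 0 otherwise. */
int variants_agree(void)
{
    enum { PAIRS = 8, N = 2 * PAIRS };
    int a[N], b[N], d[N], e[N], k1[N], k2[N];
    for (int i = 0; i < N; i++) {
        a[i] = i % 3;               /* mixes true and false compares */
        b[i] = 1;
        d[i] = 100 + i;
        e[i] = 200 + i;
    }
    select_grouped(a, b, d, e, k1, PAIRS);
    select_sunk(a, b, d, e, k2, PAIRS);
    for (int i = 0; i < N; i++)
        if (k1[i] != k2[i])
            return 0;
    return 1;
}
```

The two variants compute the same k[] when the arrays do not overlap. In the sunk form the loads of d[] and e[] no longer sit next to each other in one block but are spread over conditional blocks, which is the change in the data references that the message describes.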