On Mon, Jan 30, 2017 at 6:19 PM, Aldy Hernandez <al...@redhat.com> wrote: > On 01/30/2017 10:03 AM, Richard Biener wrote: >> >> On Fri, Jan 27, 2017 at 12:20 PM, Aldy Hernandez <al...@redhat.com> wrote: >>> >>> On 01/26/2017 07:29 AM, Richard Biener wrote: >>>> >>>> >>>> On Thu, Jan 26, 2017 at 1:04 PM, Aldy Hernandez <al...@redhat.com> >>>> wrote: >>>>> >>>>> >>>>> On 01/24/2017 07:23 AM, Richard Biener wrote: >>> >>> >>> >>>> Your initial on-demand approach is fine to catch some of the cases but >>>> it >>>> will not handle those for which we'd need post-dominance. >>>> >>>> I guess we can incrementally add that. >>> >>> >>> >>> No complaints from me. >>> >>> This is my initial on-demand approach, with a few fixes you've commented >>> on >>> throughout. >>> >>> As you can see, there is still an #if 0 wrt to using your suggested >>> conservative handling of memory loads, which I'm not entirely convinced >>> of, >>> as it pessimizes gcc.dg/loop-unswitch-1.c. If you feel strongly about >>> it, I >>> can enable the code again. >> >> >> It is really required -- fortunately loop-unswitch-1.c is one of the cases >> where >> the post-dom / always-executed bits help . The comparison is inside the >> loop header and thus always executed when the loop enters, so inserrting >> it >> on the preheader edge is fine. > > > Left as is then. > >> >>> Also, I enhanced gcc.dg/loop-unswitch-1.c to verify that we're actually >>> unswitching something. It seems kinda silly that we have various >>> unswitch >>> tests, but we don't actually check whether we have unswitched anything. >> >> >> Heh. It probably was added for an ICE... >> >>> This test was the only one in the *unswitch*.c set that I saw was >>> actually >>> being unswitched. Of course, if you don't agree with my #if 0 above, I >>> will >>> remove this change to the test. >>> >>> Finally, are we even guaranteed to unswitch in loop-unswitch-1.c across >>> architectures? If not, again, I can remove the one-liner. >> >> >> I think so. > > > Left as well. > >> >>> >>> How does this look for trunk? >> >> >> With a unswitch-local solution I meant to not add new files but put the >> defined_or_undefined class (well, or rather a single function...) into >> tree-ssa-loop-unswitch.c. > > > Done. > >> >> @@ -138,7 +141,7 @@ tree_may_unswitch_on (basic_block bb, struct loop >> *loop) >> { >> /* Unswitching on undefined values would introduce undefined >> behavior that the original program might never exercise. */ >> - if (ssa_undefined_value_p (use, true)) >> + if (defined_or_undefined->is_maybe_undefined (use)) >> return NULL_TREE; >> def = SSA_NAME_DEF_STMT (use); >> def_bb = gimple_bb (def); >> >> as I said, moving this (now more expensive check) after >> >> if (def_bb >> && flow_bb_inside_loop_p (loop, def_bb)) >> return NULL_TREE; >> >> this cheap check would be better. It should avoid 99% of all calls I bet. > > > Done. > >> >> You can recover the loop-unswitch-1.c case by passing down >> the using stmt and checking its BB against loop_header (the only >> block that we trivially know is always executed when entering the region). >> Or do that check in the caller, like >> >> if (bb != loop->header >> && ssa_undefined_value_p (use, true) / >> defined_or_undefined->is_maybe_undefined (use)) > > > Done in callee. > >> >> + gimple *def = SSA_NAME_DEF_STMT (t); >> + >> + /* Check that all the PHI args are fully defined. */ >> + if (gphi *phi = dyn_cast <gphi *> (def)) >> + { >> + if (virtual_operand_p (PHI_RESULT (phi))) >> + continue; >> >> You should never run into virtual operands (you only walk SSA_OP_USEs). > > > Done. > >> >> You can stop walking at stmts that dominate the region header, >> like with >> >> + gimple *def = SSA_NAME_DEF_STMT (t); >> /* Uses in stmts always executed when the region header >> executes are fine. */ >> if (dominated_by_p (CDI_DOMINATORS, loop_header, gimple_bb (def))) >> continue; > > > Hmmmm... doing this causes the PR testcase (gcc.dg/loop-unswitch-5.c in the > attached patch to FAIL). I haven't looked at it, but I seem to recall in > the testcase that we could have a DEF that dominated the loop but was a mess > of PHI's, some of whose args were undefined. > > Did you perhaps want to put that dominated_by_p call after the PHI arg > checks (which seems to work)?:
Ah, yes - of course. PHIs are not really "defs". > /* Check that all the PHI args are fully defined. */ > if (gphi *phi = dyn_cast <gphi *> (def)) > ... > ... > ... > > + /* Uses in stmts always executed when the region header executes > + are fine. */ > + if (dominated_by_p (CDI_DOMINATORS, loop->header, gimple_bb (def))) > + continue; > + > /* Handle calls and memory loads conservatively. */ > if (!is_gimple_assign (def) > || (gimple_assign_single_p (def) > > Until this is clear, I've left this dominated_by_p call #if 0'ed out. > >> >> and the bail out for PARM_DECLs is wrong: >> >> + /* A PARM_DECL will not have an SSA_NAME_DEF_STMT. Parameters >> + get their initial value from function entry. */ >> + if (SSA_NAME_VAR (t) && TREE_CODE (SSA_NAME_VAR (t)) == PARM_DECL) >> + return false; >> >> needs to be a continue; rather than a return false. > > > Done. > > Preliminary test show the attached patch works. Further tests on-going. Looks good now, with the #if 0 dominance check moved after the PHI handling. Thus, ok for trunk if the fixed variant passes testing. Thanks, Richard. > Aldy