On Sat, Mar 1, 2014 at 2:19 AM, Richard Biener <richard.guent...@gmail.com> wrote: > On Fri, Feb 28, 2014 at 7:23 PM, H.J. Lu <hjl.to...@gmail.com> wrote: >> On Fri, Feb 28, 2014 at 9:42 AM, H.J. Lu <hjl.to...@gmail.com> wrote: >>> On Fri, Feb 28, 2014 at 9:25 AM, H.J. Lu <hjl.to...@gmail.com> wrote: >>>> On Fri, Feb 28, 2014 at 8:11 AM, H.J. Lu <hjl.to...@gmail.com> wrote: >>>>> On Fri, Feb 28, 2014 at 2:09 AM, Richard Biener >>>>> <richard.guent...@gmail.com> wrote: >>>>>> On Fri, Feb 28, 2014 at 10:09 AM, Richard Biener >>>>>> <richard.guent...@gmail.com> wrote: >>>>>>> On Fri, Feb 28, 2014 at 1:52 AM, H.J. Lu <hjl.to...@gmail.com> wrote: >>>>>>>> On Mon, Feb 24, 2014 at 9:12 PM, bin.cheng <bin.ch...@arm.com> wrote: >>>>>>>>> Hi, >>>>>>>>> This patch is to fix regression reported in PR60280 by removing >>>>>>>>> forward loop >>>>>>>>> headers/latches in cfg cleanup if possible. Several tests are broken >>>>>>>>> by >>>>>>>>> this change since cfg cleanup is shared by all optimizers. Some >>>>>>>>> tests has >>>>>>>>> already been fixed by recent patches, I went through and fixed the >>>>>>>>> others. >>>>>>>>> One case needs to be clarified is "gcc.dg/tree-prof/update-loopch.c". >>>>>>>>> When >>>>>>>>> GCC removing a basic block, it checks profile information by calling >>>>>>>>> check_bb_profile after redirecting incoming edges of the bb. This >>>>>>>>> certainly >>>>>>>>> results in warnings about invalid profile information and causes the >>>>>>>>> case to >>>>>>>>> fail. I will send a patch to skip checking profile information for a >>>>>>>>> removing basic block in stage 1 if it sounds reasonable. For now I >>>>>>>>> just >>>>>>>>> twisted the case itself. >>>>>>>>> >>>>>>>>> Bootstrap and tested on x86_64 and arm_a15. >>>>>>>>> >>>>>>>>> Is it OK? >>>>>>>>> >>>>>>>>> >>>>>>>>> 2014-02-25 Bin Cheng <bin.ch...@arm.com> >>>>>>>>> >>>>>>>>> PR target/60280 >>>>>>>>> * tree-cfgcleanup.c (tree_forwarder_block_p): Protect loop >>>>>>>>> preheaders and latches only if requested. Fix latch if it >>>>>>>>> is removed. >>>>>>>>> * tree-ssa-dom.c (tree_ssa_dominator_optimize): Set >>>>>>>>> LOOPS_HAVE_PREHEADERS. >>>>>>>>> >>>>>>>> >>>>>>>> This change: >>>>>>>> >>>>>>>> if (dest->loop_father->header == dest) >>>>>>>> - return false; >>>>>>>> + { >>>>>>>> + if (loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS) >>>>>>>> + && bb->loop_father->header != dest) >>>>>>>> + return false; >>>>>>>> + >>>>>>>> + if (loops_state_satisfies_p (LOOPS_HAVE_SIMPLE_LATCHES) >>>>>>>> + && bb->loop_father->header == dest) >>>>>>>> + return false; >>>>>>>> + } >>>>>>>> } >>>>>>>> >>>>>>>> miscompiled 435.gromacs in SPEC CPU 2006 on x32 with >>>>>>>> >>>>>>>> -O3 -funroll-loops -ffast-math -fwhole-program -flto=jobserver >>>>>>>> -fuse-linker-plugin >>>>>>>> >>>>>>>> This patch changes loops without LOOPS_HAVE_PREHEADERS >>>>>>>> nor LOOPS_HAVE_SIMPLE_LATCHES from returning false to returning >>>>>>>> true. I don't have a small testcase. But this patch: >>>>>>>> >>>>>>>> diff --git a/gcc/tree-cfgcleanup.c b/gcc/tree-cfgcleanup.c >>>>>>>> index b5c384b..2ba673c 100644 >>>>>>>> --- a/gcc/tree-cfgcleanup.c >>>>>>>> +++ b/gcc/tree-cfgcleanup.c >>>>>>>> @@ -323,6 +323,10 @@ tree_forwarder_block_p (basic_block bb, bool >>>>>>>> phi_wanted) >>>>>>>> if (loops_state_satisfies_p (LOOPS_HAVE_SIMPLE_LATCHES) >>>>>>>> && bb->loop_father->header == dest) >>>>>>>> return false; >>>>>>>> + >>>>>>>> + if (!loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS) >>>>>>>> + && !loops_state_satisfies_p (LOOPS_HAVE_SIMPLE_LATCHES)) >>>>>>>> + return false; >>>>>>>> } >>>>>>>> } >>>>>>>> >>>>>>>> fixes the regression. Does it make any senses? >>>>>>> >>>>>>> I think the preheader test isn't fully correct (bb may be in an inner >>>>>>> loop >>>>>>> for example). So a more conservative variant would be >>>>>>> >>>>>>> Index: gcc/tree-cfgcleanup.c >>>>>>> =================================================================== >>>>>>> --- gcc/tree-cfgcleanup.c (revision 208169) >>>>>>> +++ gcc/tree-cfgcleanup.c (working copy) >>>>>>> @@ -316,13 +316,13 @@ tree_forwarder_block_p (basic_block bb, >>>>>>> /* Protect loop preheaders and latches if requested. */ >>>>>>> if (dest->loop_father->header == dest) >>>>>>> { >>>>>>> - if (loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS) >>>>>>> - && bb->loop_father->header != dest) >>>>>>> - return false; >>>>>>> - >>>>>>> - if (loops_state_satisfies_p (LOOPS_HAVE_SIMPLE_LATCHES) >>>>>>> - && bb->loop_father->header == dest) >>>>>>> - return false; >>>>>>> + if (bb->loop_father == dest->loop_father) >>>>>>> + return !loops_state_satisfies_p (LOOPS_HAVE_SIMPLE_LATCHES); >>>>>>> + else if (bb->loop_father == loop_outer (dest->loop_father)) >>>>>>> + return !loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS); >>>>>>> + /* Always preserve other edges into loop headers that are >>>>>>> + not simple latches or preheaders. */ >>>>>>> + return false; >>>>>>> } >>>>>>> } >>>>>>> >>>>>>> that makes sure we can properly update loop information. It's also >>>>>>> a more conservative change at this point which should still successfully >>>>>>> remove simple latches and preheaders created by loop discovery. >>>>>> >>>>>> I think the patch makes sense anyway and thus I'll install it once it >>>>>> passed bootstrap / regtesting. >>>>>> >>>>>> Another fix that may make sense is to restrict it to >>>>>> !loops_state_satisfies_p (LOOPS_NEED_FIXUP), though cfgcleanup >>>>>> itself can end up setting that ... which we eventually should fix if it >>>>>> still happens. That is, check if >>>>>> >>>>>> Index: gcc/tree-cfgcleanup.c >>>>>> =================================================================== >>>>>> --- gcc/tree-cfgcleanup.c (revision 208169) >>>>>> +++ gcc/tree-cfgcleanup.c (working copy) >>>>>> >>>>>> @@ -729,8 +729,9 @@ cleanup_tree_cfg_noloop (void) >>>>>> >>>>>> timevar_pop (TV_TREE_CLEANUP_CFG); >>>>>> >>>>>> - if (changed && current_loops) >>>>>> - loops_state_set (LOOPS_NEED_FIXUP); >>>>>> + if (changed && current_loops >>>>>> + && !loops_state_satisfies_p (LOOPS_NEED_FIXUP)) >>>>>> + verify_loop_structure (); >>>>>> >>>>>> return changed; >>>>>> } >>>>>> >>>>>> trips anywhere (and apply fixes). That's of course not appropriate at >>>>>> this stage. >>>>>> >>>>>>> Does it fix 435.gromacs? >>>>> >>>>> I tried revision 208222 and it doesn't fix 435.gromacs. >>>> >>>> Remove >>>> >>>> else if (bb->loop_father == loop_outer (dest->loop_father)) >>>> return !loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS); >>> >>> Should we also check other loop state, like LOOPS_HAVE_SIMPLE_LATCHES >>> or LOOPS_MAY_HAVE_MULTIPLE_LATCHES here? >>> >> >> This patch: >> >> diff --git a/gcc/tree-cfgcleanup.c b/gcc/tree-cfgcleanup.c >> index 926d300..fb1c63d 100644 >> --- a/gcc/tree-cfgcleanup.c >> +++ b/gcc/tree-cfgcleanup.c >> @@ -328,7 +328,9 @@ tree_forwarder_block_p (basic_block bb, bool phi_wanted) >> (LOOPS_MAY_HAVE_MULTIPLE_LATCHES)); >> } >> else if (bb->loop_father == loop_outer (dest->loop_father)) >> - return !loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS); >> + return (!loops_state_satisfies_p (LOOPS_HAVE_PREHEADERS) >> + && !loops_state_satisfies_p >> + (LOOPS_MAY_HAVE_MULTIPLE_LATCHES)); >> /* Always preserve other edges into loop headers that are >> not simple latches or preheaders. */ >> return false; >> >> works. Does it look right? > > No. Latches and pre-headers are independent. The above effectively > disables removing preheaders. > > Unfortunately I don't have a x32 runtime environment to reproduce the > issue. > > So, can you narrow it down by bisecting the actual preheader removal? > Like, add a static counter for the number of times returned true above > and return false if it is between getenv("from") and getenv("to") and > then narrow down from and to? Then look at the actual transform > done if it hopefully narrows down to a single removal ... (should be > enough to do that for the ltrans stage, so eventually you can narrow > it down to a specific ltrans unit first).
I opened: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60418 -- H.J.