On Tue, Mar 5, 2019 at 11:44 AM Richard Biener
<richard.guent...@gmail.com> wrote:
>
> On Tue, Mar 5, 2019 at 10:48 AM Richard Biener
> <richard.guent...@gmail.com> wrote:
> >
> > On Mon, Mar 4, 2019 at 11:01 PM Qing Zhao <qing.z...@oracle.com> wrote:
> > >
> > > Hi, Richard,
> > >
> > > > On Mar 4, 2019, at 5:45 AM, Richard Biener
> > > > <richard.guent...@gmail.com> wrote:
> > > >>
> > > >> It looks like DOM fails to visit stmts generated by simplification.
> > > >> Can you open a bug report with a testcase?
> > > >>
> > > >>
> > > >> The problem is, It took me quite some time in order to come up with a
> > > >> small and independent testcase for this problem,
> > > >> a little bit change made the error disappear.
> > > >>
> > > >> do you have any suggestion on this? or can you give me some hint on
> > > >> how to fix this in DOM? then I can try the fix on my side?
> > > >
> > > > I remember running into similar issues in the past where I tried to
> > > > extract temporary nonnull ranges from divisions.
> > > > I have there
> > > >
> > > > @@ -1436,11 +1436,16 @@ dom_opt_dom_walker::before_dom_children
> > > >    m_avail_exprs_stack->pop_to_marker ();
> > > >
> > > >    edge taken_edge = NULL;
> > > > -  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> > > > -    {
> > > > -      evrp_range_analyzer.record_ranges_from_stmt (gsi_stmt (gsi), false);
> > > > -      taken_edge = this->optimize_stmt (bb, gsi);
> > > > -    }
> > > > +  gsi = gsi_start_bb (bb);
> > > > +  if (!gsi_end_p (gsi))
> > > > +    while (1)
> > > > +      {
> > > > +       evrp_range_analyzer.record_def_ranges_from_stmt (gsi_stmt (gsi), false);
> > > > +       taken_edge = this->optimize_stmt (bb, &gsi);
> > > > +       if (gsi_end_p (gsi))
> > > > +         break;
> > > > +       evrp_range_analyzer.record_use_ranges_from_stmt (gsi_stmt (gsi));
> > > > +      }
> > > >
> > > >    /* Now prepare to process dominated blocks.  */
> > > >    record_edge_info (bb);
> > > >
> > > > OTOH the issue in your case is that fold emits new stmts before gsi
> > > > but the above loop will never look at them.  See tree-ssa-forwprop.c
> > > > for code how to deal with this (setting a pass-local flag on stmts
> > > > visited and walking back to unvisited, newly inserted ones).  The
> > > > fold_stmt interface could in theory also be extended to insert new
> > > > stmts on a sequence passed to it so the caller would be responsible
> > > > for inserting them into the IL and could then more easily revisit
> > > > them (but that's a bigger task).
> > > >
> > > > So, does the following help?
> > >
> > > Yes, this change fixed the error in my side, now, in the dumped file for
> > > pass dom3:
> > >
> > > ====
> > > Visiting statement:
> > > i_49 = _98 > 0 ? k_105 : 0;
> > > Meeting
> > >   [0, 65535]
> > > and
> > >   [0, 0]
> > > to
> > >   [0, 65535]
> > > Intersecting
> > >   [0, 65535]
> > > and
> > >   [0, 65535]
> > > to
> > >   [0, 65535]
> > > Optimizing statement i_49 = _98 > 0 ? k_105 : 0;
> > >   Replaced 'k_105' with variable '_98'
> > > gimple_simplified to _152 = MAX_EXPR <_98, 0>;
> > > i_49 = _152;
> >
> > Ah, that looks interesting.  From this detail we might be
> > able to derive a testcase as well - a GIMPLE one
> > eventually because DOM runs quite late.  It's also interesting
> > to see the inefficient code here (the extra copy), probably
> > some known issue with match-and-simplify, I'd have to check.
> >
> > >   Folded to: i_49 = _152;
> > > LKUP STMT i_49 = _152
> > > ==== ASGN i_49 = _152
> > >
> > > Visiting statement:
> > > _152 = MAX_EXPR <_98, 0>;
> > >
> > > Visiting statement:
> > > i_49 = _152;
> > > Intersecting
> > >   [0, 65535]  EQUIVALENCES: { _152 } (1 elements)
> > > and
> > >   [0, 65535]
> > > to
> > >   [0, 65535]  EQUIVALENCES: { _152 } (1 elements)
> > > ====
> > >
> > > We can clearly see from the above, all the new stmts generated by fold
> > > are visited now.
> >
> > We can also see that DOMs optimize_stmt code is not executed on the
> > first stmt of the folding result (the MAX_EXPR), so the fix can be
> > probably amended/simplified with that in mind.
> >
> > > it is also confirmed that the runtime error caused by this bug was gone
> > > with this fix.
> > >
> > > So, what’s the next step for this issue?
> > >
> > > will you commit this fix to gcc9 and gcc8 (we need it in gcc8)?
> >
> > I'll see to carve out some cycles trying to find a testcase and amend
> > the fix a bit and will take care of testing/submitting the fix.  Thanks
> > for testing that it works for your case.
>
> I filed PR89595 with a testcase.

So fixing it properly, by also re-running optimize_stmt on those stmts so
that we'd CSE the MAX_EXPR introduced by folding, makes it somewhat ugly.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Any ideas how to make it less ugly?  I can split out making optimize_stmt
take a gsi * btw, in case that's a more obvious change and it makes the
patch a little smaller.

Richard.

2019-03-05  Richard Biener  <rguent...@suse.de>

        PR tree-optimization/89595
        * tree-ssa-dom.c (dom_opt_dom_walker::optimize_stmt): Take
        stmt iterator as reference, take boolean output parameter to
        indicate whether the stmt was removed and thus the iterator
        already advanced.
        (dom_opt_dom_walker::before_dom_children): Re-iterate over
        stmts created by folding.

        * gcc.dg/torture/pr89595.c: New testcase.
Attachment: fix-pr89595
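
For readers who want the "walk back to unvisited, newly inserted stmts" idea from the quoted discussion in isolation, here is a minimal standalone sketch.  It is not the attached patch and does not use GCC's real APIs: Stmt, Block, fold and walk are hypothetical stand-ins, a std::list models the statement sequence of a basic block, and the only folding rule is hard-coded to mimic the MAX_EXPR case from the dump.

```cpp
#include <iterator>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a GIMPLE statement; NOT GCC's real data structure.
struct Stmt {
  std::string text;
  bool visited;  // pass-local "already processed" flag
  explicit Stmt(std::string t) : text(std::move(t)), visited(false) {}
};

using Block = std::list<Stmt>;

// Hypothetical folder: rewrites one hard-coded conditional into a MAX
// statement inserted *before* the current position plus a copy at the
// current position.  Returns true if it changed anything.
bool fold(Block &bb, Block::iterator it) {
  if (it->text != "i = a > 0 ? a : 0")
    return false;
  bb.insert(it, Stmt("t = MAX (a, 0)"));  // new stmt lands before `it`
  it->text = "i = t";
  return true;
}

// Walk the block.  After a successful fold, step back over any unvisited
// statements in front of the iterator so the freshly inserted ones are
// processed as well.  Returns the statements in processing order.
std::vector<std::string> walk(Block &bb) {
  std::vector<std::string> order;
  for (Block::iterator it = bb.begin(); it != bb.end(); ++it) {
    if (fold(bb, it)) {
      // Rewind to the first unvisited statement preceding `it`.
      while (it != bb.begin() && !std::prev(it)->visited)
        --it;
    }
    it->visited = true;
    order.push_back(it->text);
  }
  return order;
}
```

Calling walk on a block holding "a = x", "i = a > 0 ? a : 0" and "r = i" processes the inserted MAX statement before the rewritten copy, which is the property a plain forward loop lacks.  The visited flag is what bounds the rewind: everything the walk has already handled stops it.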