On Fri, May 24, 2019 at 11:15 AM Feng Xue OS <f...@os.amperecomputing.com> wrote: > > This version is based on the proposal of Richard. And fix a bug on OpenACC > loop when this opt is turned on. > Also add some test cases
Comments below. > Feng > ----- > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index 9f0f889..d1c1e3a 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,16 @@ > +2019-05-23 Feng Xue <f...@os.amperecomputing.com> > + > + PR tree-optimization/89713 > + * doc/invoke.texi (-ffinite-loop): Document new option. > + * common.opt (-ffinite-loop): New option. > + * tree-ssa-dce.c (loop_has_true_exits): New function. > + (mark_stmt_if_obviously_necessary): Mark IFN_GOACC_LOOP > + calls as neccessary. > + (find_obviously_necessary_stmts): Use flag to control > + removal of a loop with assumed finiteness. > + (eliminate_unnecessary_stmts): Do not delete dead result > + of IFN_GOACC_LOOP calls. > + > 2019-05-22 David Malcolm <dmalc...@redhat.com> > > PR c++/90462 > diff --git a/gcc/common.opt b/gcc/common.opt > index d342c4f..e98a34d 100644 > --- a/gcc/common.opt > +++ b/gcc/common.opt > @@ -1437,6 +1437,10 @@ ffinite-math-only > Common Report Var(flag_finite_math_only) Optimization SetByCombined > Assume no NaNs or infinities are generated. > > +ffinite-loop I think it should be -ffinite-loops (plural) > +Common Report Var(flag_finite_loop) Optimization > +Assume loops are finite if can not be analytically determined. This is a too vague description. I'd rather write 'Assume that loops with an exit will terminate and not loop indefinitely.' > + > ffixed- > Common Joined RejectNegative Var(common_deferred_options) Defer > -ffixed-<register> Mark <register> as being unavailable to the compiler. > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index 6c89843..caa0852 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -412,6 +412,7 @@ Objective-C and Objective-C++ Dialects}. > -fdevirtualize-at-ltrans -fdse @gol > -fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects > @gol > -ffast-math -ffinite-math-only -ffloat-store > -fexcess-precision=@var{style} @gol > +-ffinite-loop @gol > -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol > -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol > -fgcse-sm -fhoist-adjacent-loads -fif-conversion @gol > @@ -9501,6 +9502,20 @@ that may set @code{errno} but are otherwise free of > side effects. This flag is > enabled by default at @option{-O2} and higher if @option{-Os} is not also > specified. > > +@item -ffinite-loop > +@opindex ffinite-loop > +@opindex fno-finite-loop > +Allow the compiler to assume that if finiteness of a loop can not be > +analytically determined, the loop must be finite. With the assumption, some > +aggressive transformation could be possible, such as removal of this kind > +of empty loop by dead code elimination (DCE). "Assume that a loop with an exit will eventually take the exit and not loop indefinitely. This allows the compiler to remove loops that otherwise have no side-effects, not considering eventual endless looping as such." > +This option is not turned on by any @option{-O} option since it might result > +in incorrect behaviour for programs that contain seemly finite, but actually > +infinite loop. I think we should turn this option on by default, document that and note that some languages (C++) say loops terminate. > + > +The default is @option{-fno-finite-loop}. > + > @item -ftree-dominator-opts > @opindex ftree-dominator-opts > Perform a variety of simple scalar cleanups (constant/copy > diff --git a/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C > b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C > new file mode 100644 > index 0000000..e374155 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C > @@ -0,0 +1,33 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loop" } */ > + > +#include <string> > +#include <vector> > +#include <list> > +#include <set> > +#include <map> > + > +using namespace std; > + > +int foo (vector<string> &v, list<string> &l, set<string> &s, map<int, > string> &m) > +{ > + for (vector<string>::iterator it = v.begin (); it != v.end (); ++it) > + it->length(); > + > + for (list<string>::iterator it = l.begin (); it != l.end (); ++it) > + it->length(); > + > + for (map<int, string>::iterator it = m.begin (); it != m.end (); ++it) > + it->first + it->second.length(); > + > + for (set<string>::iterator it0 = s.begin (); it0 != s.end(); ++it0) > + for (vector<string>::reverse_iterator it1 = v.rbegin(); it1 != v.rend(); > ++it1) > + { > + it0->length(); > + it1->length(); > + } > + > + return 0; > +} > +/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */ > + > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c > b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c > new file mode 100644 > index 0000000..ffca49c > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c > @@ -0,0 +1,37 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-cddce1 -ffinite-loop" } */ > + > +typedef struct list { > + char pad[15]; > + struct list *next; > +} list; > + > +int data; > + > +list *head, *tail; > + > +int __attribute__((pure)) pfn (int); > + > +int foo (unsigned u, int s) > +{ > + unsigned i; > + list *p; > + int j; > + > + for (i = 0; i < u; i += 2) > + ; > + > + for (p = head; p; p = p->next) > + ; > + > + for (j = data; j & s; j = pfn (j + 3)) > + ; > + > + for (p = head; p != tail; p = p->next) > + for (j = data + 1; j > s; j = pfn (j + 2)) > + ; > + > + return 0; > +} > +/* { dg-final { scan-tree-dump-not "if" "cddce1"} } */ > + > diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c > index 2478219..bb143a3 100644 > --- a/gcc/tree-ssa-dce.c > +++ b/gcc/tree-ssa-dce.c > @@ -245,6 +245,18 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool > aggressive) > mark_stmt_necessary (stmt, true); > return; > } > + /* IFN_GOACC_LOOP calls are neccessary in that they are used to > + represent parameter (i.e. step, bound) of a lowered OpenACC > + partitioned loop. But this kind of partitioned loop might not > + survive from aggressive loop removal for it has loop exit and > + is assumed to be finite. Therefore, we need to explictly mark > + these calls. (An example is libgomp.oacc-c-c++-common/pr84955.c) > */ > + if (gimple_call_internal_p (stmt) > + && gimple_call_internal_fn (stmt) == IFN_GOACC_LOOP) > + { > + mark_stmt_necessary (stmt, true); > + return; > + } Thomas - this looks like an artifact of "bad" testcases like the cited one? > if (!gimple_call_lhs (stmt)) > return; > break; > @@ -357,6 +369,28 @@ mark_control_dependent_edges_necessary (basic_block bb, > bool ignore_self) > } > > > +/* Check whether a loop has any non-EH exit. */ > + > +static bool > +loop_has_true_exits (const struct loop *loop) > +{ > + vec<edge> exits = get_loop_exit_edges (loop); > + bool found = false; > + edge e; > + unsigned i; > + > + FOR_EACH_VEC_ELT (exits, i, e) > + if (!(e->flags & EDGE_EH)) > + { > + found = true; > + break; > + } > + > + exits.release (); > + return found; > +} > + > + > /* Find obviously necessary statements. These are things like most function > calls, and stores to file level variables. > > @@ -417,8 +451,15 @@ find_obviously_necessary_stmts (bool aggressive) > } > > FOR_EACH_LOOP (loop, 0) > - if (!finite_loop_p (loop)) > + if (!finite_loop_p (loop)) > { > + if (flag_finite_loop && loop_has_true_exits (loop)) finite_loop_p is a predicate that should honor flag_finite_loop so please adjust that instead. Simply inline loop_has_true_exits there at the end with a comment explaining it. > + { > + if (dump_file) > + fprintf (dump_file, "assume loop %i to be finite\n", > loop->num); > + continue; > + } > + > if (dump_file) > fprintf (dump_file, "cannot prove finiteness of loop %i\n", > loop->num); > mark_control_dependent_edges_necessary (loop->latch, false); > @@ -1313,7 +1354,12 @@ eliminate_unnecessary_stmts (void) > && DECL_FUNCTION_CODE (call) != BUILT_IN_MALLOC > && DECL_FUNCTION_CODE (call) != BUILT_IN_CALLOC > && !ALLOCA_FUNCTION_CODE_P > - (DECL_FUNCTION_CODE (call))))) > + (DECL_FUNCTION_CODE (call)))) > + /* Avoid doing so for OpenACC abstraction calls > + (IFN_GOACC_LOOP), because later pass that lowers those > + calls need to access lhs of calls. */ > + && (!gimple_call_internal_p (stmt) > + || gimple_call_internal_fn (stmt) != IFN_GOACC_LOOP)) You can use gimple_call_internal_p (stmt, IFN_GOACC_LOOP) Thomas? This part looks OK to me but it seems lowering could deal with this as well? Thanks for working on this. Richard. > { > something_changed = true; > if (dump_file && (dump_flags & TDF_DETAILS)) > diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c > b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c > new file mode 100644 > index 0000000..845268b > --- /dev/null > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c > @@ -0,0 +1,31 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loop" } */ > + > +int > +f1 (void) > +{ > + int i, j; > + > +#pragma acc parallel loop tile(2,3) > + for (i = 1; i < 10; i++) > + for (j = 1; j < 10; j++) > + for (;;) > + ; > + > + return i + j; > +} > + > +int > +f2 (void) > +{ > + int i, j, k; > + > +#pragma acc parallel loop tile(2,3) > + for (i = 1; i < 10; i++) > + for (j = 1; j < 10; j++) > + for (k = 1; k < 10; k++) > + ; > + > + return i + j; > +} > +/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */