As shown in PR84955, GCC has problems with empty infinite loops of the
form

  #pragma acc loop tile (2, 2)
  for (...)
    for (...)
      {
        for (...)
          /* do nothing */ ;
      }

I also observed that it generates bad code for the following loop:

  #pragma acc loop
  for (...)
    {
      for (...)
        /* do nothing */ ;
    }

There are two problems here. First, for the tiled loop, when
expand_oacc_for splits the first basic block of the loop body to add a
branch for the tiling control logic, it does not consider the fact that
the body may have no exit when it contains an infinite loop. This
situation can occur when region->cont is NULL. The fix here is to add a
dummy false edge to exit_bb.

The second problem involves fixing up the CFG. Unlike OpenMP, which
defers a lot of the scheduling control to libgomp, OpenACC makes heavy
use of calls to internal functions. The problem is that the fixup_cfg
pass was largely ignoring calls to internal functions when deciding
whether to set PROP_cleanup_cfg. When the CFG isn't cleaned up, the LTO
streamer reports fewer loops than there actually are, and that causes
an ICE because the empty infinite loop is never accounted for.

This patch resolves this issue by relaxing the fixup_cfg pass to treat
all function calls, including those to internal functions, the same
way. An alternative approach would have been to teach
ipa_write_summaries to check whether the loop structure needs to be
fixed up and to call cleanup_cfg as necessary. But I wanted to keep the
OMP and OACC code paths similar, so I took the former approach.

I regression tested this patch on x86_64-linux using nvptx offloading.
Is this patch OK for trunk and GCC 7 (and probably GCC 6)?

Thanks,
Cesar
Fix PR84955

2018-04-06  Cesar Philippidis  <ce...@codesourcery.com>

	PR middle-end/84955

	gcc/
	* cfgloop.c (flow_loops_find): Add assert.
	* omp-expand.c (expand_oacc_for): Add dummy false branch for
	tiled basic blocks without omp continue statements.
	* tree-cfg.c (execute_fixup_cfg): Handle calls to internal
	functions like regular functions.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/pr84955.c: New test.
	* testsuite/libgomp.oacc-fortran/pr84955.f90: New test.

diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c
index 8af793c6015..6e68639452c 100644
--- a/gcc/cfgloop.c
+++ b/gcc/cfgloop.c
@@ -462,6 +462,9 @@ flow_loops_find (struct loops *loops)
         {
           struct loop *loop;
 
+          if (!from_scratch)
+            gcc_assert (header->loop_father != NULL);
+
           /* The current active loop tree has valid loop-fathers for
              header blocks.  */
           if (!from_scratch
diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
index bb204906ea6..1c7b68fbd8c 100644
--- a/gcc/omp-expand.c
+++ b/gcc/omp-expand.c
@@ -5439,6 +5439,13 @@ expand_oacc_for (struct omp_region *region, struct omp_for_data *fd)
 
       split->flags ^= EDGE_FALLTHRU | EDGE_TRUE_VALUE;
 
+      /* Add a dummy exit for the tiled block when cont_bb is missing.  */
+      if (cont_bb == NULL)
+        {
+          edge e = make_edge (body_bb, exit_bb, EDGE_FALSE_VALUE);
+          e->probability = profile_probability::even ();
+        }
+
       /* Initialize the user's loop vars.  */
       gsi = gsi_start_bb (elem_body_bb);
       expand_oacc_collapse_vars (fd, true, &gsi, counts, e_offset);
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 9485f73f341..cb676d78128 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -9586,10 +9586,7 @@ execute_fixup_cfg (void)
       for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
         {
           gimple *stmt = gsi_stmt (gsi);
-          tree decl = is_gimple_call (stmt)
-                      ? gimple_call_fndecl (stmt)
-                      : NULL;
-          if (decl)
+          if (is_gimple_call (stmt))
             {
               int flags = gimple_call_flags (stmt);
               if (flags & (ECF_CONST | ECF_PURE | ECF_LOOPING_CONST_OR_PURE))
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955.c
new file mode 100644
index 00000000000..5910b57b68d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955.c
@@ -0,0 +1,20 @@
+/* { dg-do compile }  */
+
+int
+main ()
+{
+  int i, j;
+
+#pragma acc parallel loop tile(2,3)
+  for (i = 1; i < 10; i++)
+    for (j = 1; j < 10; j++)
+      for (;;)
+        ;
+
+#pragma acc parallel loop
+  for (i = 1; i < 10; i++)
+    for (;;)
+      ;
+
+  return i + j;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr84955.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr84955.f90
new file mode 100644
index 00000000000..878d8a89f41
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/pr84955.f90
@@ -0,0 +1,20 @@
+! { dg-do compile }
+
+subroutine s
+  integer :: i, j
+  !$acc parallel loop tile(2,3)
+  do i = 1, 10
+     do j = 1, 10
+        do
+        end do
+     end do
+  end do
+  !$acc end parallel loop
+
+  !$acc parallel loop
+  do i = 1, 10
+     do
+     end do
+  end do
+  !$acc end parallel loop
+end subroutine s
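
For readers unfamiliar with the execute_fixup_cfg change, the following is a toy model of the condition that the tree-cfg.c hunk relaxes. It is only a sketch and not GCC code: struct toy_stmt, toy_inspected_before, toy_inspected_after, toy_count and toy_body are all hypothetical names made up for illustration. The one assumption carried over from the patch is that a call to an internal function has no function declaration, so a check of the form "if (decl)" skips it, while "if (is_gimple_call (stmt))" does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these types or names exist in GCC.  */

struct toy_stmt
{
  bool is_call;     /* Stands in for is_gimple_call (stmt).  */
  bool has_fndecl;  /* Stands in for gimple_call_fndecl (stmt) != NULL.  */
};

/* Condition before the patch: only calls that have a declaration.  */
static bool
toy_inspected_before (const struct toy_stmt *s)
{
  return s->is_call && s->has_fndecl;
}

/* Condition after the patch: every call statement.  */
static bool
toy_inspected_after (const struct toy_stmt *s)
{
  return s->is_call;
}

/* Count the statements that the given condition would inspect.  */
static size_t
toy_count (const struct toy_stmt *stmts, size_t n,
           bool (*inspected) (const struct toy_stmt *))
{
  assert (stmts != NULL || n == 0);

  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    if (inspected (&stmts[i]))
      count++;
  return count;
}

/* An ordinary call, an internal-function call, and a non-call statement.  */
static const struct toy_stmt toy_body[] = {
  { true, true },
  { true, false },
  { false, false },
};
```

With toy_body, the pre-patch condition inspects only the ordinary call, while the post-patch condition also inspects the internal-function call. In the real pass, inspecting a call is what allows its flags to influence whether a CFG cleanup is requested.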