Hi! On Wed, 1 Jun 2016 17:12:17 +0200, Jakub Jelinek <ja...@redhat.com> wrote: > On Wed, Jun 01, 2016 at 05:06:42PM +0200, Thomas Schwinge wrote: > > Here are the OpenACC bits of <http://gcc.gnu.org/PR71373>. > > > > As we're currently not paying attention to OpenACC tile clauses in the > > middle end, and thus OMP_CLAUSE_TILE's arguments are not to be considered > > stable, I opted to simply discard them early, simplifying their > > gcc/tree-nested.c handling. Everything else should be self-explanatory. > > > > OK for trunk and gcc-6-branch? > > LGTM for both, but please as a follow-up try to work also on a C testcase > with nested functions that covers all the clauses (both referencing > vars/expressions that are defined in the current function and used by a > nested function, and for vars/expressions that are defined in parent > function and used in clauses inside of nested function.
OK, I translated gcc/testsuite/gfortran.dg/goacc/subroutines.f90 from Fortran to C: gcc/testsuite/gcc.dg/goacc/nested.c. For amusement ;-) I'm also including the test case that originally made us aware of the problem, gcc/testsuite/gcc.dg/goacc/pr71373.c. Oh, and I just remembered <http://news.gmane.org/find-root.php?message_id=%3C5459732B.1010101%40codesourcery.com%3E>, so after re-testing, I'll also include these test cases, as far as still relevant. Nested function decomposition is not applicable to C++, so we don't need any C++ test cases, right? During the translation of gcc/testsuite/gfortran.dg/goacc/subroutines.f90 from Fortran to C, I stumbled upon <https://gcc.gnu.org/PR71381> "C/C++ OpenACC cache directive rejects valid syntax", <http://news.gmane.org/find-root.php?message_id=%3C877fe7sthf.fsf%40kepler.schwinge.homeip.net%3E>, so that one will need to go in first, before I'll then commit the following: commit 7eff9da0e8fe5eda7d76b9a27dbb1ec4e6183844 Author: Thomas Schwinge <tho...@codesourcery.com> Date: Wed Jun 1 17:01:35 2016 +0200 [PR middle-end/71373] Handle more OMP_CLAUSE_* in nested function decomposition gcc/ * gimplify.c (gimplify_adjust_omp_clauses): Discard OMP_CLAUSE_TILE. * omp-low.c (scan_sharing_clauses): Don't expect OMP_CLAUSE_TILE. gcc/testsuite/ * c-c++-common/goacc/combined-directives.c: XFAIL tree scanning for OpenACC tile clauses. * gfortran.dg/goacc/combined-directives.f90: Likewise. gcc/ PR middle-end/71373 * tree-nested.c (convert_nonlocal_omp_clauses) (convert_local_omp_clauses): Handle OMP_CLAUSE_ASYNC, OMP_CLAUSE_WAIT, OMP_CLAUSE_INDEPENDENT, OMP_CLAUSE_AUTO, OMP_CLAUSE__CACHE_, OMP_CLAUSE_TILE. gcc/testsuite/ PR middle-end/71373 * gcc.dg/goacc/nested.c: New file. * gcc.dg/goacc/pr71373.c: Likewise. * gfortran.dg/goacc/subroutines.f90: Update. --- gcc/gimplify.c | 6 ++ gcc/omp-low.c | 4 +- .../c-c++-common/goacc/combined-directives.c | 3 +- gcc/testsuite/gcc.dg/goacc/nested.c | 100 +++++++++++++++++++++ gcc/testsuite/gcc.dg/goacc/pr71373.c | 41 +++++++++ .../gfortran.dg/goacc/combined-directives.f90 | 3 +- gcc/testsuite/gfortran.dg/goacc/subroutines.f90 | 56 ++++++++---- gcc/tree-nested.c | 30 +++++++ 8 files changed, 221 insertions(+), 22 deletions(-) diff --git gcc/gimplify.c gcc/gimplify.c index f12c6a1..7c19cf3 100644 --- gcc/gimplify.c +++ gcc/gimplify.c @@ -8280,7 +8280,13 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p, case OMP_CLAUSE_VECTOR: case OMP_CLAUSE_AUTO: case OMP_CLAUSE_SEQ: + break; + case OMP_CLAUSE_TILE: + /* We're not yet making use of the information provided by OpenACC + tile clauses. Discard these here, to simplify later middle end + processing. */ + remove = true; break; default: diff --git gcc/omp-low.c gcc/omp-low.c index 91d5fcf..c6ba31c 100644 --- gcc/omp-low.c +++ gcc/omp-low.c @@ -2187,7 +2187,6 @@ scan_sharing_clauses (tree clauses, omp_context *ctx, case OMP_CLAUSE_GANG: case OMP_CLAUSE_WORKER: case OMP_CLAUSE_VECTOR: - case OMP_CLAUSE_TILE: case OMP_CLAUSE_INDEPENDENT: case OMP_CLAUSE_AUTO: case OMP_CLAUSE_SEQ: @@ -2201,6 +2200,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx, break; case OMP_CLAUSE__CACHE_: + case OMP_CLAUSE_TILE: default: gcc_unreachable (); } @@ -2357,7 +2357,6 @@ scan_sharing_clauses (tree clauses, omp_context *ctx, case OMP_CLAUSE_GANG: case OMP_CLAUSE_WORKER: case OMP_CLAUSE_VECTOR: - case OMP_CLAUSE_TILE: case OMP_CLAUSE_INDEPENDENT: case OMP_CLAUSE_AUTO: case OMP_CLAUSE_SEQ: @@ -2365,6 +2364,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx, break; case OMP_CLAUSE__CACHE_: + case OMP_CLAUSE_TILE: default: gcc_unreachable (); } diff --git gcc/testsuite/c-c++-common/goacc/combined-directives.c gcc/testsuite/c-c++-common/goacc/combined-directives.c index c2a3c57..3fa800d 100644 --- gcc/testsuite/c-c++-common/goacc/combined-directives.c +++ gcc/testsuite/c-c++-common/goacc/combined-directives.c @@ -111,6 +111,7 @@ test () // { dg-final { scan-tree-dump-times "acc loop vector" 2 "gimple" } } // { dg-final { scan-tree-dump-times "acc loop seq" 2 "gimple" } } // { dg-final { scan-tree-dump-times "acc loop auto" 2 "gimple" } } -// { dg-final { scan-tree-dump-times "acc loop tile.2, 3" 2 "gimple" } } +// XFAILed: OpenACC tile clauses are discarded during gimplification. +// { dg-final { scan-tree-dump-times "acc loop tile.2, 3" 2 "gimple" { xfail *-*-* } } } // { dg-final { scan-tree-dump-times "acc loop independent private.i" 2 "gimple" } } // { dg-final { scan-tree-dump-times "private.z" 2 "gimple" } } diff --git gcc/testsuite/gcc.dg/goacc/nested.c gcc/testsuite/gcc.dg/goacc/nested.c new file mode 100644 index 0000000..6e1f236 --- /dev/null +++ gcc/testsuite/gcc.dg/goacc/nested.c @@ -0,0 +1,100 @@ +/* Exercise how tree-nested.c handles OpenACC clauses. */ +/* See gcc/testsuite/gfortran.dg/goacc/subroutines.f90 for the Fortran + version. */ + +int main () +{ +#define N 100 + int nonlocal_arg; + int nonlocal_a[N]; + int nonlocal_i; + int nonlocal_j; + + for (int i = 0; i < N; ++i) + nonlocal_a[i] = 5; + nonlocal_arg = 5; + + void local () + { + int local_i; + int local_arg; + int local_a[N]; + int local_j; + + for (int i = 0; i < N; ++i) + local_a[i] = 5; + local_arg = 5; + +#pragma acc kernels loop \ + gang(num:local_arg) worker(local_arg) vector(local_arg) \ + wait async(local_arg) + for (local_i = 0; local_i < N; ++local_i) + { +#pragma acc cache (local_a[local_i:5]) + local_a[local_i] = 100; +#pragma acc loop seq tile(*) + for (local_j = 0; local_j < N; ++local_j) + ; +#pragma acc loop auto independent tile(1) + for (local_j = 0; local_j < N; ++local_j) + ; + } + +#pragma acc kernels loop \ + gang(static:local_arg) worker(local_arg) vector(local_arg) \ + wait(local_arg, local_arg + 1, local_arg + 2) async + for (local_i = 0; local_i < N; ++local_i) + { +#pragma acc cache (local_a[local_i:4]) + local_a[local_i] = 100; +#pragma acc loop seq tile(1) + for (local_j = 0; local_j < N; ++local_j) + ; +#pragma acc loop auto independent tile(*) + for (local_j = 0; local_j < N; ++local_j) + ; + } + } + + void nonlocal () + { + for (int i = 0; i < N; ++i) + nonlocal_a[i] = 5; + nonlocal_arg = 5; + +#pragma acc kernels loop \ + gang(num:nonlocal_arg) worker(nonlocal_arg) vector(nonlocal_arg) \ + wait async(nonlocal_arg) + for (nonlocal_i = 0; nonlocal_i < N; ++nonlocal_i) + { +#pragma acc cache (nonlocal_a[nonlocal_i:3]) + nonlocal_a[nonlocal_i] = 100; +#pragma acc loop seq tile(2) + for (nonlocal_j = 0; nonlocal_j < N; ++nonlocal_j) + ; +#pragma acc loop auto independent tile(3) + for (nonlocal_j = 0; nonlocal_j < N; ++nonlocal_j) + ; + } + +#pragma acc kernels loop \ + gang(static:nonlocal_arg) worker(nonlocal_arg) vector(nonlocal_arg) \ + wait(nonlocal_arg, nonlocal_arg + 1, nonlocal_arg + 2) async + for (nonlocal_i = 0; nonlocal_i < N; ++nonlocal_i) + { +#pragma acc cache (nonlocal_a[nonlocal_i:2]) + nonlocal_a[nonlocal_i] = 100; +#pragma acc loop seq tile(*) + for (nonlocal_j = 0; nonlocal_j < N; ++nonlocal_j) + ; +#pragma acc loop auto independent tile(*) + for (nonlocal_j = 0; nonlocal_j < N; ++nonlocal_j) + ; + } + } + + local (); + nonlocal (); + + return 0; +} diff --git gcc/testsuite/gcc.dg/goacc/pr71373.c gcc/testsuite/gcc.dg/goacc/pr71373.c new file mode 100644 index 0000000..9381752 --- /dev/null +++ gcc/testsuite/gcc.dg/goacc/pr71373.c @@ -0,0 +1,41 @@ +/* Unintentional nested function usage. */ +/* Due to missing right braces '}', the following functions are parsed as + nested functions. This ran into an ICE. */ + +void foo (void) +{ + #pragma acc parallel + { + #pragma acc loop independent + for (int i = 0; i < 16; i++) + ; + // Note right brace '}' commented out here. + //} +} +void bar (void) +{ +} + +// Adding right brace '}' here, to make this compile. +} + + +// ..., and the other way round: + +void BAR (void) +{ +// Note right brace '}' commented out here. +//} + +void FOO (void) +{ + #pragma acc parallel + { + #pragma acc loop independent + for (int i = 0; i < 16; i++) + ; + } +} + +// Adding right brace '}' here, to make this compile. +} diff --git gcc/testsuite/gfortran.dg/goacc/combined-directives.f90 gcc/testsuite/gfortran.dg/goacc/combined-directives.f90 index 42a447a..abb5e6b 100644 --- gcc/testsuite/gfortran.dg/goacc/combined-directives.f90 +++ gcc/testsuite/gfortran.dg/goacc/combined-directives.f90 @@ -143,7 +143,8 @@ end subroutine test ! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. vector" 2 "gimple" } } ! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. seq" 2 "gimple" } } ! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. auto" 2 "gimple" } } -! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. tile.2, 3" 2 "gimple" } } +! XFAILed: OpenACC tile clauses are discarded during gimplification. +! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. tile.2, 3" 2 "gimple" { xfail *-*-* } } } ! { dg-final { scan-tree-dump-times "acc loop private.i. independent" 2 "gimple" } } ! { dg-final { scan-tree-dump-times "private.z" 2 "gimple" } } ! { dg-final { scan-tree-dump-times "omp target oacc_\[^ \]+ map.force_tofrom:y" 2 "gimple" } } diff --git gcc/testsuite/gfortran.dg/goacc/subroutines.f90 gcc/testsuite/gfortran.dg/goacc/subroutines.f90 index 6cab798..479ef4f 100644 --- gcc/testsuite/gfortran.dg/goacc/subroutines.f90 +++ gcc/testsuite/gfortran.dg/goacc/subroutines.f90 @@ -1,6 +1,5 @@ -! Exercise how tree-nested.c handles gang, worker vector and seq. - -! { dg-do compile } +! Exercise how tree-nested.c handles OpenACC clauses. +! See gcc/testsuite/c-c++-common/goacc/nested.c for the C version. program main integer, parameter :: N = 100 @@ -8,10 +7,10 @@ program main integer :: nonlocal_a(N) integer :: nonlocal_i integer :: nonlocal_j - + nonlocal_a (:) = 5 nonlocal_arg = 5 - + call local () call nonlocal () @@ -22,24 +21,35 @@ contains integer :: local_arg integer :: local_a(N) integer :: local_j - + local_a (:) = 5 local_arg = 5 - !$acc kernels loop gang(num:local_arg) worker(local_arg) vector(local_arg) + !$acc kernels loop & + !$acc gang(num:local_arg) worker(local_arg) vector(local_arg) & + !$acc wait async(local_arg) do local_i = 1, N + !$acc cache (local_a(local_i:local_i + 5)) local_a(local_i) = 100 - !$acc loop seq + !$acc loop seq tile(*) + do local_j = 1, N + enddo + !$acc loop auto independent tile(1) do local_j = 1, N enddo enddo !$acc end kernels loop - !$acc kernels loop gang(static:local_arg) worker(local_arg) & - !$acc vector(local_arg) + !$acc kernels loop & + !$acc gang(static:local_arg) worker(local_arg) vector(local_arg) & + !$acc wait(local_arg, local_arg + 1, local_arg + 2) async do local_i = 1, N + !$acc cache (local_a(local_i:local_i + 4)) local_a(local_i) = 100 - !$acc loop seq + !$acc loop seq tile(1) + do local_j = 1, N + enddo + !$acc loop auto independent tile(*) do local_j = 1, N enddo enddo @@ -49,22 +59,32 @@ contains subroutine nonlocal () nonlocal_a (:) = 5 nonlocal_arg = 5 - - !$acc kernels loop gang(num:nonlocal_arg) worker(nonlocal_arg) & - !$acc vector(nonlocal_arg) + + !$acc kernels loop & + !$acc gang(num:nonlocal_arg) worker(nonlocal_arg) vector(nonlocal_arg) & + !$acc wait async(nonlocal_arg) do nonlocal_i = 1, N + !$acc cache (nonlocal_a(nonlocal_i:nonlocal_i + 3)) nonlocal_a(nonlocal_i) = 100 - !$acc loop seq + !$acc loop seq tile(2) + do nonlocal_j = 1, N + enddo + !$acc loop auto independent tile(3) do nonlocal_j = 1, N enddo enddo !$acc end kernels loop - !$acc kernels loop gang(static:nonlocal_arg) worker(nonlocal_arg) & - !$acc vector(nonlocal_arg) + !$acc kernels loop & + !$acc gang(static:nonlocal_arg) worker(nonlocal_arg) vector(nonlocal_arg) & + !$acc wait(nonlocal_arg, nonlocal_arg + 1, nonlocal_arg + 2) async do nonlocal_i = 1, N + !$acc cache (nonlocal_a(nonlocal_i:nonlocal_i + 2)) nonlocal_a(nonlocal_i) = 100 - !$acc loop seq + !$acc loop seq tile(*) + do nonlocal_j = 1, N + enddo + !$acc loop auto independent tile(*) do nonlocal_j = 1, N enddo enddo diff --git gcc/tree-nested.c gcc/tree-nested.c index 25a92aa..97d3c52 100644 --- gcc/tree-nested.c +++ gcc/tree-nested.c @@ -1114,6 +1114,8 @@ convert_nonlocal_omp_clauses (tree *pclauses, struct walk_stmt_info *wi) case OMP_CLAUSE_GANG: case OMP_CLAUSE_WORKER: case OMP_CLAUSE_VECTOR: + case OMP_CLAUSE_ASYNC: + case OMP_CLAUSE_WAIT: /* Several OpenACC clauses have optional arguments. Check if they are present. */ if (OMP_CLAUSE_OPERAND (clause, 0)) @@ -1197,8 +1199,21 @@ convert_nonlocal_omp_clauses (tree *pclauses, struct walk_stmt_info *wi) case OMP_CLAUSE_SIMD: case OMP_CLAUSE_DEFAULTMAP: case OMP_CLAUSE_SEQ: + case OMP_CLAUSE_INDEPENDENT: + case OMP_CLAUSE_AUTO: break; + case OMP_CLAUSE__CACHE_: + /* These clauses belong to the OpenACC cache directive, which is + discarded during gimplification, so we don't expect to see + anything here. */ + gcc_unreachable (); + + case OMP_CLAUSE_TILE: + /* OpenACC tile clauses are discarded during gimplification, so we + don't expect to see anything here. */ + gcc_unreachable (); + default: gcc_unreachable (); } @@ -1790,6 +1805,8 @@ convert_local_omp_clauses (tree *pclauses, struct walk_stmt_info *wi) case OMP_CLAUSE_GANG: case OMP_CLAUSE_WORKER: case OMP_CLAUSE_VECTOR: + case OMP_CLAUSE_ASYNC: + case OMP_CLAUSE_WAIT: /* Several OpenACC clauses have optional arguments. Check if they are present. */ if (OMP_CLAUSE_OPERAND (clause, 0)) @@ -1878,8 +1895,21 @@ convert_local_omp_clauses (tree *pclauses, struct walk_stmt_info *wi) case OMP_CLAUSE_SIMD: case OMP_CLAUSE_DEFAULTMAP: case OMP_CLAUSE_SEQ: + case OMP_CLAUSE_INDEPENDENT: + case OMP_CLAUSE_AUTO: break; + case OMP_CLAUSE__CACHE_: + /* These clauses belong to the OpenACC cache directive, which is + discarded during gimplification, so we don't expect to see + anything here. */ + gcc_unreachable (); + + case OMP_CLAUSE_TILE: + /* OpenACC tile clauses are discarded during gimplification, so we + don't expect to see anything here. */ + gcc_unreachable (); + default: gcc_unreachable (); } Grüße Thomas