[PATCH] amdgcn: Add gfx90c target

2024-04-25 Thread Frederik Harwath
809e2a0248e6fad1e8336b4a883a729017cc62e5 Mon Sep 17 00:00:00 2001 From: Frederik Harwath Date: Wed, 24 Apr 2024 20:29:14 +0200 Subject: [PATCH] amdgcn: Add gfx90c target Add support for gfx90c GCN5 APU integrated graphics devices. The LLVM AMDGPU documentation does not list those devices as

Re: [PATCH] OpenMP: warn about iteration var modifications in loop body

2024-03-06 Thread Frederik Harwath
Ping. The Linaro CI has kindly pointed me to two test regressions that I had missed. I have adjusted the test expectations in the updated patch, which I have attached. Frederik On 28.02.24 8:32 PM, Frederik Harwath wrote: Hi, this patch implements a warning about (some simple cases of direct

[PATCH] OpenMP: warn about iteration var modifications in loop body

2024-02-28 Thread Frederik Harwath
u and not observed any regressions. Is it ok to commit this? Best regards, Frederik From 4944a9f94bcda9907e0118e71137ee7e192657c2 Mon Sep 17 00:00:00 2001 From: Frederik Harwath Date: Tue, 27 Feb 2024 21:07:00 + Subject: [PATCH] OpenMP: warn about iteration var modifications in loop bod

[PATCH 2/4] openmp: Fix initialization for 'unroll full'

2023-07-28 Thread Frederik Harwath
The index variable initialization for the 'omp unroll' directive with 'full' clause got lost and the testsuite did not catch it. Add the initialization and add -Wall to some tests to detect uninitialized variable uses and other potential problems in the code generation. gcc/ChangeLog: *

[PATCH 3/4] openmp: Fix diagnostic message for "omp unroll"

2023-07-28 Thread Frederik Harwath
gcc/ChangeLog: * omp-transform-loops.cc (print_optimized_unroll_partial_msg): Output "omp unroll partial" instead of "omp unroll auto". (optimize_transformation_clauses): Likewise. libgomp/ChangeLog: * testsuite/libgomp.fortran/loop-transforms/unroll-6.f90: Adjust

[PATCH 4/4] openmp: Fix number of iterations computation for "omp unroll full"

2023-07-28 Thread Frederik Harwath
gcc/ChangeLog: * omp-transform-loops.cc (gomp_for_number_of_iterations): Always compute "final - init" and do not take absolute value. Identify non-iterating and infinite loops for constant init, final, step values for better diagnostic messages, consistent
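The iteration-count handling described in this ChangeLog entry can be illustrated with a small sketch. The code below is not taken from omp-transform-loops.cc: the names loop_kind, loop_info and classify_loop are hypothetical, and the sketch assumes a loop of the form "for (i = init; i < final; i += step)" with small constant operands, so that no arithmetic overflows.

```c
#include <stdint.h>

/* Hypothetical classification of the loop
   "for (i = init; i < final; i += step)" with constant operands.
   None of these names come from GCC.  */
enum loop_kind { LOOP_FINITE, LOOP_NON_ITERATING, LOOP_INFINITE };

struct loop_info
{
  enum loop_kind kind;
  int64_t iterations;   /* Only meaningful for LOOP_FINITE.  */
};

static struct loop_info
classify_loop (int64_t init, int64_t final, int64_t step)
{
  struct loop_info info = { LOOP_NON_ITERATING, 0 };

  /* Signed difference "final - init"; no absolute value is taken, so
     the sign tells whether the loop condition holds on entry.  */
  int64_t diff = final - init;

  if (diff <= 0)
    return info;        /* Condition is false on entry.  */

  if (step <= 0)
    {
      /* i never gets closer to final (ignoring overflow).  */
      info.kind = LOOP_INFINITE;
      return info;
    }

  info.kind = LOOP_FINITE;
  info.iterations = (diff + step - 1) / step;   /* Ceiling division.  */
  return info;
}
```

Keeping the sign of the difference is what lets the sketch tell a non-iterating loop from an infinite one; taking the absolute value first would discard that information.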

[PATCH 0/4] openmp: loop transformation fixes

2023-07-28 Thread Frederik Harwath
Hi, the following patches contain some fixes from the devel/omp/gcc-13 branch to the patches that implement the OpenMP 5.1 loop transformation directives, which I posted in March 2023. Frederik Frederik Harwath (4): openmp: Fix loop transformation tests openmp: Fix initialization for

[PATCH 1/4] openmp: Fix loop transformation tests

2023-07-28 Thread Frederik Harwath
libgomp/ChangeLog: * testsuite/libgomp.fortran/loop-transforms/tile-2.f90: Add reduction clause. * testsuite/libgomp.fortran/loop-transforms/unroll-1.f90: Initialize var. * testsuite/libgomp.fortran/loop-transforms/unroll-simd-1.f90: Add reduction and initializat

Re: [PATCH 0/7] openmp: OpenMP 5.1 loop transformation directives

2023-05-17 Thread Frederik Harwath via Gcc-patches
Hi Jakub, On 16.05.23 13:00, Jakub Jelinek wrote: On Tue, May 16, 2023 at 11:45:16AM +0200, Frederik Harwath wrote: The place where different compilers implement the loop transformations was discussed in an OpenMP loop transformation meeting last year. Two compilers (another one and GCC with

Re: [PATCH 0/7] openmp: OpenMP 5.1 loop transformation directives

2023-05-16 Thread Frederik Harwath via Gcc-patches
Hi Jakub, On 15.05.23 12:19, Jakub Jelinek wrote: On Fri, Mar 24, 2023 at 04:30:38PM +0100, Frederik Harwath wrote: this patch series implements the OpenMP 5.1 "unroll" and "tile" constructs. It includes changes to the C, C++, and Fortran front ends for parsing the new

[PATCH] Docs, OpenMP: Small fixes to internal OMP_FOR doc

2023-04-19 Thread Frederik Harwath via Gcc-patches
= 0;   return D_2064; } (Strictly speaking, the OMP_FOR is represented as a gomp_for at this point, but this does not really matter.) Can I commit the patch? Best regards, Frederik From 8af01114c295086526a67f56f6256fc945b1ccb5 Mon Sep 17 00:00:00 2001 From: Frederik Harwath Date: Wed, 19 Apr 2023 13

Re: [PATCH 1/7] openmp: Add Fortran support for "omp unroll" directive

2023-04-06 Thread Frederik Harwath via Gcc-patches
ay, even with 100 repeated test executions ;-). Best regards, Frederik From 3f471ed293d2e97198a65447d2f0d2bb69a2f305 Mon Sep 17 00:00:00 2001 From: Frederik Harwath Date: Thu, 6 Apr 2023 14:52:07 +0200 Subject: [PATCH] openmp: Fix loop transformation tests libgomp/ChangeLog: * testsuite/

[PATCH 7/7] openmp: Add C/C++ support for loop transformations on inner loops

2023-03-24 Thread Frederik Harwath
Add the parsing of loop transformations on inner loops of a loop-nest. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_nested_loop_transform_clauses): Add argument for the level of loop-nest at which the clauses appear, ... (c_parser_omp_tile): ... adjust use here,

[PATCH 6/7] openmp: Add Fortran support for loop transformations on inner loops

2023-03-24 Thread Frederik Harwath
So far the implementation of the "omp tile" and "omp unroll" directives restricted their use to the outermost loop of a loop-nest. This commit changes the Fortran front end to parse and verify the directives on inner loops. The transformation clauses are extended to carry the information about the

[PATCH 5/7] openmp: Add C/C++ support for "omp tile"

2023-03-24 Thread Frederik Harwath
This commit adds the C and C++ front end support for the "omp tile" directive. gcc/c-family/ChangeLog: * c-omp.cc (c_omp_directives): Add PRAGMA_OMP_TILE. * c-pragma.cc (omp_pragmas_simd): Likewise. * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_TILE. (enum pragma

[PATCH 4/7] openmp: Add Fortran support for "omp tile"

2023-03-24 Thread Frederik Harwath
This commit implements the Fortran front end support for the "omp tile" directive and the corresponding middle end transformation. gcc/fortran/ChangeLog: * gfortran.h (enum gfc_statement): Add ST_OMP_TILE, ST_OMP_END_TILE. (enum gfc_exec_op): Add EXEC_OMP_TILE. (loop_trans

[PATCH 3/7] openacc: Rename OMP_CLAUSE_TILE to OMP_CLAUSE_OACC_TILE

2023-03-24 Thread Frederik Harwath
OMP_CLAUSE_TILE will be used for the OpenMP 5.1 loop transformation construct "omp tile". gcc/ChangeLog: * tree-core.h (enum omp_clause_code): Rename OMP_CLAUSE_TILE. * tree.h (OMP_CLAUSE_TILE_LIST): Rename to ... (OMP_CLAUSE_OACC_TILE_LIST): ... this. (OMP_CLAUSE_

[PATCH 2/7] openmp: Add C/C++ support for "omp unroll" directive

2023-03-24 Thread Frederik Harwath
This commit implements the C and the C++ front end changes to support the "omp unroll" directive. The execution of the loop transformation relies on the pass that has been added as a part of the earlier Fortran patch. gcc/c-family/ChangeLog: * c-gimplify.cc (c_genericize_control_stmt): H

[PATCH 0/7] openmp: OpenMP 5.1 loop transformation directives

2023-03-24 Thread Frederik Harwath
with both nvptx-none and amdgcn-amdhsa offloading. Best regards, Frederik Frederik Harwath (7): openmp: Add Fortran support for "omp unroll" directive openmp: Add C/C++ support for "omp unroll" directive openacc: Rename OMP_CLAUSE_TILE to OMP_CLAUSE_OACC_TILE openmp: A

[PATCH 40/40] openacc: Adjust testsuite to new "kernels" handling

2021-12-16 Thread Frederik Harwath
Adjust the testsuite to changed expectations with the new Graphite-based "kernels" handling. libgomp/ChangeLog: * testsuite/libgomp.oacc-c++/privatized-ref-2.C: Adjust. * testsuite/libgomp.oacc-c++/privatized-ref-3.C: Adjust. * testsuite/libgomp.oacc-c-c++-common/acc_prof

[PATCH 39/40] openacc: Check type for references in reduction lowering

2021-12-15 Thread Frederik Harwath
gcc/ChangeLog: * omp-low.c (lower_oacc_reductions): Only create a reference if variable has pointer type. --- gcc/omp-low.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/gcc/omp-low.c b/gcc/omp-low.c index ae5cdfc5e260..2b8b848ec03a 100644 --- a/gcc/omp-

[PATCH 38/40] openacc: fix privatization of by-reference arrays

2021-12-15 Thread Frederik Harwath
From: Tobias Burnus Replacing of a by-reference variable in a private clause by a local variable makes sense; however, for arrays, the size is not directly known by the type. This causes an ICE via create_tmp_var which indirectly invokes force_constant_size in this case - but the latter only hand

[PATCH 37/40] Fix for is_gimple_reg vars to 'data kernels'

2021-12-15 Thread Frederik Harwath
From: Tobias Burnus Nearly all variable mapping is moved from 'kernels' to a surrounding 'data kernels' and then 'force_present' mapped for the 'kernels'. However, as libgomp.oacc-c-c++-common/declare-vla.c shows, moving 'int i, N' will fail as there is a special case for is_gimple_reg in mapping

[PATCH 36/40] openacc: Enable reduction variable localization for "kernels"

2021-12-15 Thread Frederik Harwath
gcc/ChangeLog: * gimplify.c (gimplify_omp_for): Enable localization on "kernels" regions. (gimplify_omp_workshare): Likewise. --- gcc/gimplify.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/gcc/gimplify.c b/gcc/gimplify.c index bf37388f

[PATCH 35/40] Handle references in OpenACC "private" clauses

2021-12-15 Thread Frederik Harwath
From: Julian Brown gcc/ * gimplify.c (localize_reductions): Rewrite references for OMP_CLAUSE_PRIVATE also. libgomp/ * testsuite/libgomp.oacc-fortran/privatized-ref-1.f95: New test. * testsuite/libgomp.oacc-c++/privatized-ref-2.C: New test.

[PATCH 34/40] Use more appropriate var in localize_reductions call

2021-12-15 Thread Frederik Harwath
From: Julian Brown gcc/ * gimplify.c (gimplify_omp_for): Use for_stmt in call to localize_reductions. --- gcc/gimplify.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 04ffbc256442..daa69ccf6202 100644 --- a/gcc

[PATCH 33/40] Fix tree check failure with reduction localization

2021-12-15 Thread Frederik Harwath
From: Julian Brown gcc/ * gimplify.c (gimplify_omp_workshare): Use OMP_CLAUSES, OMP_BODY instead of OMP_TARGET_CLAUSES, OMP_TARGET_BODY. --- gcc/gimplify.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 9a4331c7

[PATCH 32/40] Reference reduction localization

2021-12-15 Thread Frederik Harwath
From: Julian Brown gcc/ * gimplify.c (privatize_reduction): New struct. (localize_reductions_r, localize_reductions): New functions. (gimplify_omp_for): Call localize_reductions. (gimplify_omp_workshare): Likewise. * omp-low.c (lower_oacc_reductions

[PATCH 31/40] graphite: Accept loops without data references

2021-12-15 Thread Frederik Harwath
It seems that the check that rejects loops without data references is only included to avoid handling non-profitable loops. Including those loops in Graphite's analysis enables more consistent diagnostic messages in OpenACC "kernels" code and does not introduce any testsuite regressions. If execu

[PATCH 30/40] graphite: Adjust scop loop-nest choice

2021-12-15 Thread Frederik Harwath
The find_common_loop function is used in Graphite to obtain a common super-loop of all loops inside a SCoP. The function is applied to the loop of the destination block of the edge that leads into the SESE region and the loop of the source block of the edge that exits the region. The exit block i

[PATCH 29/40] graphite: Tune parameters for OpenACC use

2021-12-15 Thread Frederik Harwath
The default values of some parameters that restrict Graphite's resource usage are too low for many OpenACC codes. Furthermore, exceeding the limits does not always lead to user-visible diagnostic messages. This commit increases the parameter values on OpenACC functions. The values were chosen to

[PATCH 28/40] openacc: Disable pass_pre on outlined functions analyzed by Graphite

2021-12-15 Thread Frederik Harwath
The additional dependences introduced by partial redundancy elimination proper and by the code hoisting step of the pass very often cause Graphite to fail on OpenACC functions. On the other hand, the pass can also enable the analysis of OpenACC loops (cf. e.g. the loop-auto-transfer-4.f90 testcase)

[PATCH 27/40] openacc: Handle internal function calls in pass_lim

2021-12-15 Thread Frederik Harwath
The loop invariant motion pass correctly refuses to move statements out of a loop if any other statement in the loop is unanalyzable. The pass does not know how to handle the OpenACC internal function calls, which was not necessary until recently, when the OpenACC device lowering pass was moved to a

[PATCH 25/40] openacc: Add runtime alias checking for OpenACC kernels

2021-12-15 Thread Frederik Harwath
From: Andrew Stubbs This commit adds the code generation for the runtime alias checks for OpenACC loops that have been analyzed by Graphite. The runtime alias check condition gets generated in Graphite. It is evaluated by the code generated for the IFN_GOACC_LOOP internal function calls. If ali

[PATCH 26/40] openacc: Warn about "independent" "kernels" loops with data-dependences

2021-12-15 Thread Frederik Harwath
This commit concerns loops in OpenACC "kernels" regions that have been marked up with an explicit "independent" clause by the user, but for which Graphite found data dependences. A discussion on the private internal OpenACC mailing list suggested that warning the user about the dependences would be

[PATCH 24/40] openacc: Add data optimization pass

2021-12-15 Thread Frederik Harwath
From: Andrew Stubbs Address PR90591 "Avoid unnecessary data transfer out of OMP construct", for simple (but common) cases. This commit adds a pass that optimizes data mapping clauses. Currently, it can optimize copy/map(tofrom) clauses involving scalars to copyin/map(to) and further to "private"

[PATCH 23/40] Add function for printing a single OMP_CLAUSE

2021-12-15 Thread Frederik Harwath
Commit 89f4f339130c ("For 'OMP_CLAUSE' in 'dump_generic_node', dump the whole OMP clause chain") changed the dumping behavior for OMP_CLAUSEs. The old behavior is required for a follow-up commit ("openacc: Add data optimization pass") that optimizes single OMP_CLAUSEs. gcc/ChangeLog: * t

[PATCH 22/40] openacc: Remove unused partitioning in "kernels" regions

2021-12-15 Thread Frederik Harwath
With the old "kernels" handling, unparallelized regions would get executed with 1x1x1 partitioning even if the user provided explicit num_gangs, num_workers clauses etc. This commit restores this behavior by removing unused partitioning after assigning the parallelism dimensions to loops. gcc/Cha

[PATCH 21/40] openacc: Add "can_be_parallel" flag info to "graph" dumps

2021-12-15 Thread Frederik Harwath
gcc/ChangeLog: * graph.c (oacc_get_fn_attrib): New declaration. (find_loop_location): New declaration. (draw_cfg_nodes_for_loop): Print value of the can_be_parallel flag at the top of loops in OpenACC functions. --- gcc/graph.c | 35

[PATCH 19/40] graphite: Add runtime alias checking

2021-12-15 Thread Frederik Harwath
Graphite rejects a SCoP if it contains a pair of data references for which it cannot determine statically if they may alias. This happens very often, for instance in C code which does not use explicit "restrict". This commit adds the possibility to analyze a SCoP nevertheless and perform an alias
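The general idea of a runtime alias check can be sketched in plain C. This is not Graphite's implementation; the helper ranges_overlap and the example kernel scale are hypothetical names chosen for illustration. The sketch tests at run time whether two memory segments overlap and only takes the path that assumes independent iterations when they do not.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if the byte ranges [a, a + na) and [b, b + nb) overlap.
   The addresses are compared as integers to avoid relational
   comparison of pointers into different objects.  */
static bool
ranges_overlap (const void *a, size_t na, const void *b, size_t nb)
{
  uintptr_t lo_a = (uintptr_t) a, lo_b = (uintptr_t) b;
  return lo_a < lo_b + nb && lo_b < lo_a + na;
}

/* Hypothetical example kernel: dst[i] = 2 * src[i].  */
static void
scale (double *dst, const double *src, size_t n)
{
  if (!ranges_overlap (dst, n * sizeof *dst, src, n * sizeof *src))
    {
      /* No aliasing: the iterations are independent, so a compiler
         could parallelize or vectorize this version.  */
      for (size_t i = 0; i < n; i++)
        dst[i] = 2.0 * src[i];
    }
  else
    {
      /* Possible aliasing: fall back to the sequential loop.  */
      for (size_t i = 0; i < n; i++)
        dst[i] = 2.0 * src[i];
    }
}
```

Both branches contain the same source loop here; the point of the versioning is that only the first one may be transformed under the assumption that dst and src do not overlap.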

[PATCH 18/40] Move compute_alias_check_pairs to tree-data-ref.c

2021-12-15 Thread Frederik Harwath
Move this function from tree-loop-distribution.c to tree-data-ref.c and make it non-static to enable its use from other parts of GCC. gcc/ChangeLog: * tree-loop-distribution.c (data_ref_segment_size): Remove function. (latch_dominated_by_data_ref): Likewise. (compute_alias_

[PATCH 14/40] openacc: Move pass_oacc_device_lower after pass_graphite

2021-12-15 Thread Frederik Harwath
The OpenACC device lowering pass must run after the Graphite pass to allow for the use of Graphite for automatic parallelization of kernels regions in the future. Experimentation has shown that it is best, performancewise, to run pass_oacc_device_lower together with the related passes pass_oacc_loo

[PATCH 17/40] graphite: Fix minor mistakes in comments

2021-12-15 Thread Frederik Harwath
gcc/ChangeLog: * graphite-sese-to-poly.c (build_poly_sr_1): Fix a typo and a reference to a variable which does not exist. * graphite-isl-ast-to-gimple.c (gsi_insert_earliest): Fix typo in comment. --- gcc/graphite-isl-ast-to-gimple.c | 2 +- gcc/graphite-sese-to-p

[PATCH 16/40] graphite: Rename isl_id_for_ssa_name

2021-12-15 Thread Frederik Harwath
The SSA names for which this function gets used are always SCoP parameters and hence "isl_id_for_parameter" is a better name. It also explains the prefix "P_" for those names in the ISL representation. gcc/ChangeLog: * graphite-sese-to-poly.c (isl_id_for_ssa_name): Rename to ...

[PATCH 13/40] Fortran: Delinearize array accesses

2021-12-15 Thread Frederik Harwath
The Fortran front end presently linearizes accesses to multi-dimensional arrays by combining the indices for the various dimensions into a series of explicit multiplies and adds with refactoring to allow CSE of invariant parts of the computation. Unfortunately this representation interferes with Gr

[PATCH 15/40] graphite: Extend SCoP detection dump output

2021-12-15 Thread Frederik Harwath
Extend dump output to make it easier (for GCC developers) to understand why Graphite refuses to include a loop in a SCoP. ChangeLog: * graphite-scop-detection.c (scop_detection::can_represent_loop): Output reason for failure to dump file. (scop_detection::harmful_loop_in_regi

[PATCH 12/40] Relax some restrictions on the loop bound in kernels loop annotation.

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore OpenACC loop semantics require that the loop bound be computable before entering the loop, rather than the C/C++ semantics where the end test is evaluated on every iteration. Formerly the kernels loop annotator permitted only constants and variables not modified in the loo

[PATCH 11/40] Clean up loop variable extraction in OpenACC kernels loop annotation.

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore The code for identifying annotatable loops in OpenACC kernels regions previously looked for the loop variable as the left-hand side of the comparison in the loop end test. However, front end optimizations sometimes switch the sense of the comparison, making this method unr

[PATCH 10/40] Fix patterns in Fortran tests for kernels loop annotation.

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore Several of the Fortran tests for kernels loop annotation were failing due to changes in the formatting of "acc loop" constructs in the dump file. Now the "auto" clause appears first, instead of after "private". 2020-08-23 Sandra Loosemore gcc/testsuite/

[PATCH 09/40] Permit calls to builtins and intrinsics in kernels loops.

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore This tweak to the OpenACC kernels loop annotation relaxes the restrictions on function calls in the loop body. Normally calls to functions not explicitly marked with a parallelism attribute are not permitted, but C/C++ builtins and Fortran intrinsics have known semantics s

[PATCH 08/40] Annotate inner loops in "acc kernels loop" directives (Fortran).

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore Normally explicit loop directives in a kernels region inhibit automatic annotation of other loops in the same nest, on the theory that users have indicated they want manual control over that section of code. However there seems to be an expectation in user code that the co

[PATCH 07/40] Annotate inner loops in "acc kernels loop" directives (C/C++).

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore Normally explicit loop directives in a kernels region inhibit automatic annotation of other loops in the same nest, on the theory that users have indicated they want manual control over that section of code. However there seems to be an expectation in user code that the co

[PATCH 06/40] Add a "combined" flag for "acc kernels loop" etc directives.

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore 2020-08-19 Sandra Loosemore gcc/ * tree.h (OACC_LOOP_COMBINED): New. gcc/c/ * c-parser.c (c_parser_oacc_loop): Set OACC_LOOP_COMBINED. gcc/cp/ * parser.c (cp_parser_oacc_loop): Set OACC_LOOP_COMBINED. gcc/fortra

[PATCH 05/40] Fix bug in processing of array dimensions in data clauses.

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore The g++ front end wraps the array length and low_bound values in NON_LVALUE_EXPR, causing the subsequent tests for INTEGER_CST to fail. The test case c-c++-common/goacc/kernels-loop-annotation-1.c was tickling this bug and giving bogus errors in g++ because it was falling t

[PATCH 04/40] Additional Fortran testsuite fixes for kernels loops annotation pass.

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore 2020-03-27 Sandra Loosemore gcc/testsuite/ * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Adjust line numbering. * gfortran.dg/goacc/classify-kernels.f95: Likewise. * gfortran.dg/goacc/kernels-decompose-2.f95: Add

[PATCH 03/40] Kernels loops annotation: Fortran.

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore This patch implements the Fortran support for adding "#pragma acc loop auto" annotations to loops in OpenACC kernels regions. It implements the same -fopenacc-kernels-annotate-loops and -Wopenacc-kernels-annotate-loops options that were previously added (and documented) fo

[PATCH 01/40] Kernels loops annotation: C and C++.

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore This patch detects loops in kernels regions that are candidates for parallelization, and adds "#pragma acc loop auto" annotations to them. This annotation is controlled by the -fopenacc-kernels-annotate-loops option, which is enabled by default. -Wopenacc-kernels-annotate-

[PATCH 02/40] Add -fno-openacc-kernels-annotate-loops option to more testcases.

2021-12-15 Thread Frederik Harwath
From: Sandra Loosemore 2020-03-27 Sandra Loosemore gcc/testsuite/ * c-c++-common/goacc/kernels-decompose-2.c: Add -fno-openacc-kernels-annotate-loops. --- gcc/testsuite/c-c++-common/goacc/kernels-decompose-2.c | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/te

[PATCH 00/40] OpenACC "kernels" Improvements

2021-12-15 Thread Frederik Harwath
can discuss some of the changes before they can be considered for inclusion in GCC during the next stage 1. Best regards, Frederik Andrew Stubbs (2): openacc: Add data optimization pass openacc: Add runtime alias checking for OpenACC kernels Frederik Harwath (20): Fortran: Delineariz

[OG11][committed][PATCH 21/22] graphite: Accept loops without data references

2021-11-17 Thread Frederik Harwath
It seems that the check that rejects loops without data references is only included to avoid handling non-profitable loops. Including those loops in Graphite's analysis enables more consistent diagnostic messages in OpenACC "kernels" code and does not introduce any testsuite regressions. If execu

[OG11][committed][PATCH 20/22] graphite: Adjust scop loop-nest choice

2021-11-17 Thread Frederik Harwath
The find_common_loop function is used in Graphite to obtain a common super-loop of all loops inside a SCoP. The function is applied to the loop of the destination block of the edge that leads into the SESE region and the loop of the source block of the edge that exits the region. The exit block i

[OG11][committed][PATCH 19/22] graphite: Tune parameters for OpenACC use

2021-11-17 Thread Frederik Harwath
The default values of some parameters that restrict Graphite's resource usage are too low for many OpenACC codes. Furthermore, exceeding the limits does not always lead to user-visible diagnostic messages. This commit increases the parameter values on OpenACC functions. The values were chosen to

[OG11][committed][PATCH 18/22] openacc: Disable pass_pre on outlined functions analyzed by Graphite

2021-11-17 Thread Frederik Harwath
The additional dependences introduced by partial redundancy elimination proper and by the code hoisting step of the pass very often cause Graphite to fail on OpenACC functions. On the other hand, the pass can also enable the analysis of OpenACC loops (cf. e.g. the loop-auto-transfer-4.f90 testcase)

[OG11][committed][PATCH 17/22] openacc: Handle internal function calls in pass_lim

2021-11-17 Thread Frederik Harwath
The loop invariant motion pass correctly refuses to move statements out of a loop if any other statement in the loop is unanalyzable. The pass does not know how to handle the OpenACC internal function calls, which was not necessary until recently, when the OpenACC device lowering pass was moved to a

[OG11][committed][PATCH 16/22] openacc: Warn about "independent" "kernels" loops with data-dependences

2021-11-17 Thread Frederik Harwath
This commit concerns loops in OpenACC "kernels" regions that have been marked up with an explicit "independent" clause by the user, but for which Graphite found data dependences. A discussion on the private internal OpenACC mailing list suggested that warning the user about the dependences would be

[OG11][committed][PATCH 14/22] openacc: Add data optimization pass

2021-11-17 Thread Frederik Harwath
From: Andrew Stubbs Address PR90591 "Avoid unnecessary data transfer out of OMP construct", for simple (but common) cases. This commit adds a pass that optimizes data mapping clauses. Currently, it can optimize copy/map(tofrom) clauses involving scalars to copyin/map(to) and further to "private"

[OG11][committed][PATCH 15/22] openacc: Add runtime alias checking for OpenACC kernels

2021-11-17 Thread Frederik Harwath
From: Andrew Stubbs This commit adds the code generation for the runtime alias checks for OpenACC loops that have been analyzed by Graphite. The runtime alias check condition gets generated in Graphite. It is evaluated by the code generated for the IFN_GOACC_LOOP internal function calls. If ali

[OG11][committed][PATCH 13/22] Add function for printing a single OMP_CLAUSE

2021-11-17 Thread Frederik Harwath
Commit 89f4f339130c ("For 'OMP_CLAUSE' in 'dump_generic_node', dump the whole OMP clause chain") changed the dumping behavior for OMP_CLAUSEs. The old behavior is required for a follow-up commit ("openacc: Add data optimization pass") that optimizes single OMP_CLAUSEs. gcc/ChangeLog: * t

[OG11][committed][PATCH 11/22] openacc: Add further kernels tests

2021-11-17 Thread Frederik Harwath
Add some copies of tests to continue covering the old "parloops"-based "kernels" implementation - until it gets removed from GCC - and add further tests for the new Graphite-based implementation. libgomp/ChangeLog: * testsuite/libgomp.oacc-fortran/parallel-loop-auto-reduction-2.f90:

[OG11][committed][PATCH 12/22] openacc: Remove unused partitioning in "kernels" regions

2021-11-17 Thread Frederik Harwath
With the old "kernels" handling, unparallelized regions would get executed with 1x1x1 partitioning even if the user provided explicit num_gangs, num_workers clauses etc. This commit restores this behavior by removing unused partitioning after assigning the parallelism dimensions to loops. gcc/Cha

[OG11][committed][PATCH 10/22] openacc: Add "can_be_parallel" flag info to "graph" dumps

2021-11-17 Thread Frederik Harwath
gcc/ChangeLog: * graph.c (oacc_get_fn_attrib): New declaration. (find_loop_location): New declaration. (draw_cfg_nodes_for_loop): Print value of the can_be_parallel flag at the top of loops in OpenACC functions. --- gcc/graph.c | 35

[OG11][committed][PATCH 08/22] graphite: Add runtime alias checking

2021-11-17 Thread Frederik Harwath
Graphite rejects a SCoP if it contains a pair of data references for which it cannot determine statically if they may alias. This happens very often, for instance in C code which does not use explicit "restrict". This commit adds the possibility to analyze a SCoP nevertheless and perform an alias

[OG11][committed][PATCH 07/22] Move compute_alias_check_pairs to tree-data-ref.c

2021-11-17 Thread Frederik Harwath
Move this function from tree-loop-distribution.c to tree-data-ref.c and make it non-static to enable its use from other parts of GCC. gcc/ChangeLog: * tree-loop-distribution.c (data_ref_segment_size): Remove function. (latch_dominated_by_data_ref): Likewise. (compute_alias_

[OG11][committed][PATCH 05/22] graphite: Fix minor mistakes in comments

2021-11-17 Thread Frederik Harwath
gcc/ChangeLog: * graphite-sese-to-poly.c (build_poly_sr_1): Fix a typo and a reference to a variable which does not exist. * graphite-isl-ast-to-gimple.c (gsi_insert_earliest): Fix typo in comment. --- gcc/graphite-isl-ast-to-gimple.c | 2 +- gcc/graphite-sese-

[OG11][committed][PATCH 04/22] graphite: Rename isl_id_for_ssa_name

2021-11-17 Thread Frederik Harwath
The SSA names for which this function gets used are always SCoP parameters and hence "isl_id_for_parameter" is a better name. It also explains the prefix "P_" for those names in the ISL representation. gcc/ChangeLog: * graphite-sese-to-poly.c (isl_id_for_ssa_name): Rename to ...

[OG11][committed][PATCH 02/22] openacc: Move pass_oacc_device_lower after pass_graphite

2021-11-17 Thread Frederik Harwath
The OpenACC device lowering pass must run after the Graphite pass to allow for the use of Graphite for automatic parallelization of kernels regions in the future. Experimentation has shown that it is best, performancewise, to run pass_oacc_device_lower together with the related passes pass_oacc_loo

[OG11][committed][PATCH 03/22] graphite: Extend SCoP detection dump output

2021-11-17 Thread Frederik Harwath
Extend dump output to make it easier (for GCC developers) to understand why Graphite refuses to include a loop in a SCoP. ChangeLog: * graphite-scop-detection.c (scop_detection::can_represent_loop): Output reason for failure to dump file. (scop_detection::harmful_loop_in_regi

[OG11][committed][PATCH 01/22] Fortran: delinearize multi-dimensional array accesses

2021-11-17 Thread Frederik Harwath
From: Sandra Loosemore The Fortran front end presently linearizes accesses to multi-dimensional arrays by combining the indices for the various dimensions into a series of explicit multiplies and adds with refactoring to allow CSE of invariant parts of the computation. Unfortunately this represen

[OG11][committed][PATCH 00/22] OpenACC "kernels" Improvements

2021-11-17 Thread Frederik Harwath
ctors" and two trivial unrelated commits "fa558c2a6664 Fix gimple_debug_cfg declaration" and "35cdc94463fe Fix branch prediction dump message" Andrew Stubbs (2): openacc: Add data optimization pass openacc: Add runtime alias checking for OpenACC kernels Frederik Harwath

Re: [PATCH 1/2] [WIP] OpenACC: Add Graphite-based handling of "auto" loops

2020-11-16 Thread Frederik Harwath
Hi Richard, Richard Biener writes: > On Thu, Nov 12, 2020 at 11:11 AM Frederik Harwath > wrote: >> >> This patch enables the use of Graphite for the analysis of OpenACC >> "auto" loops. [...] >> Furthermore, Graphite is extended by functionality t

[PATCH 2/2] OpenACC: Add Graphite-based "kernels" handling to pass_convert_oacc_kernels

2020-11-12 Thread Frederik Harwath
This patch changes the "kernels" conversion to route loops in OpenACC "kernels" regions through Graphite. This is done by converting the loops in "kernels" regions which are not yet known to be "independent" to "auto" loops as in the current (OG10) "parloops" based "kernels" handling. Afterwards,

[PATCH 1/2] [WIP] OpenACC: Add Graphite-based handling of "auto" loops

2020-11-12 Thread Frederik Harwath
This patch enables the use of Graphite for the analysis of OpenACC "auto" loops. The goal is to decide if a loop may be parallelized (i.e. converted to an "independent" loop) or not. Graphite and the functionality on which it relies (scalar evolution, data references) are extended to interpret t

[PATCH 0/2] Use Graphite for OpenACC "kernels" regions

2020-11-12 Thread Frederik Harwath
ass which converts OpenACC kernels regions to parallel regions from OG10 (commit 809ea59722263eb6c2d48402e1eed80727134038). Best regards, Frederik Frederik Harwath (2): [WIP] OpenACC: Add Graphite-based handling of "auto" loops OpenACC: Add Graphite-based "kernels

Re: Move pass_oacc_device_lower after pass_graphite

2020-11-06 Thread Frederik Harwath
Hi Richard, Richard Biener writes: > On Tue, Nov 3, 2020 at 4:31 PM Frederik Harwath > What's on my TODO list (or on the list of things to explore) is to make > the dump file names/suffixes explicit in passes.def like via > > NEXT_PASS (pass_ccp, true /* nonzero_p */,

[PATCH] testsuite: Clean up lto and offload dump files

2020-11-04 Thread Frederik Harwath
gistergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter From 9eb5da60e8822e1f6fa90b32bff6123ed62c146c Mon Sep 17 00:00:00 2001 From: Frederik Harwath Date: Wed, 4 Nov 2020 14:09:46 +0100 Subject: [PATCH] testsuite: Clean up lto and offload dump files Dump files produced from an

Move pass_oacc_device_lower after pass_graphite

2020-11-03 Thread Frederik Harwath
om 93fb166876a0540416e19c9428316d1370dd1e1b Mon Sep 17 00:00:00 2001 From: Frederik Harwath Date: Tue, 3 Nov 2020 12:58:37 +0100 Subject: [PATCH] Move pass_oacc_device_lower after pass_graphite As a first step towards enabling the use of Graphite for optimizing OpenACC loops, the OpenACC device lo

Re: [PATCH] [og10] libgomp, Fortran: Fix OpenACC "gang reduction on an orphan loop" error message

2020-07-20 Thread Frederik Harwath
d) GmbH, Arnulfstraße 201, 80634 München / Germany Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter From 7c10ae450b95495dda362cb66770bb78b546592e Mon Sep 17 00:00:00 2001 From: Frederik Harwath Date: Mon, 20 Jul 2020 11:24:21 +0200 Subject: [PATCH] libgomp, Fortra

Re: [PATCH] [og10] libgomp, Fortran: Fix OpenACC "gang reduction on an orphan loop" error message

2020-07-07 Thread Frederik Harwath
Thomas Schwinge writes: Hi Thomas, > (CC added, for everything touching gfortran.) Thanks! > On 2020-07-07T10:52:08+0200, Frederik Harwath > wrote: >> This patch fixes the check for reductions on orphaned gang loops > > This is the "Make OpenACC orphan gang reduct

[PATCH] [og10] libgomp, Fortran: Fix OpenACC "gang reduction on an orphan loop" error message

2020-07-07 Thread Frederik Harwath
München / Germany Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter From 7320635211fff3a773beb0de1914dbfcc317ab37 Mon Sep 17 00:00:00 2001 From: Frederik Harwath Date: Tue, 7 Jul 2020 10:41:21 +0200 Subject: [PATCH] libgomp, Fortran: Fix OpenACC "gang

PING Re: testsuite: clarify scan-dump file globbing behavior

2020-06-02 Thread Frederik Harwath
Frederik Harwath writes: ping :-) > Frederik Harwath writes: > > Hi Rainer, hi Mike, > ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545803.html > > Best regards, > Frederik > >> Hi Thomas, >> >> Thomas Schwinge writes: >> >>

Re: testsuite: clarify scan-dump file globbing behavior

2020-05-25 Thread Frederik Harwath
Frederik Harwath writes: Hi Rainer, hi Mike, ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545803.html Best regards, Frederik > Hi Thomas, > > Thomas Schwinge writes: > >> I can't formally approve testsuite patches, but did a review anyway: > > Thanks

Re: [PATCH] contrib/gcc-changelog: Handle Reviewed-{by,on}

2020-05-19 Thread Frederik Harwath
Martin Liška writes: Hi Martin, > On 5/19/20 11:45 AM, Frederik Harwath wrote: > Thank you Frederick for the patch. > > Looking at what I grepped: > https://github.com/marxin/gcc-changelog/issues/1#issuecomment-621910248 I get a 404 error when I try to access this URL. The repos

[PATCH] contrib/gcc-changelog: Handle Reviewed-{by,on}

2020-05-19 Thread Frederik Harwath
ommit the patch? Best regards, Frederik - Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter From 0dc9b201bc1607de36cb9b3604a87cc3646292e3 Mon Sep 17 00:00:00 2001 From: Frederik Har

Re: testsuite: clarify scan-dump file globbing behavior

2020-05-19 Thread Frederik Harwath
Hi Thomas, Thomas Schwinge writes: > I can't formally approve testsuite patches, but did a review anyway: Thanks for the review! > On 2020-05-15T12:31:54+0200, Frederik Harwath > wrote: >> The dump >> scanning procedures are changed to make the test unresolved

testsuite: clarify scan-dump file globbing behavior

2020-05-15 Thread Frederik Harwath
e than one file (due to an attempt to call "open" on multiple files) while a failure to match any file results in an unresolved test. This commit documents the globbing behavior. The dump scanning procedures are changed to make the test unresolved if globbing matches more than o

Re: [og8] Report errors on missing OpenACC reduction clauses in nested reductions

2020-04-21 Thread Frederik Harwath
"scan_omp_for". I have executed "make check" (on x86_64-linux-gnu) to verify that the change causes no regressions. Ok to push the commit to master? Best regards, Frederik - Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany Registergeric

Re: [og9] Really fix og9 "Fix hang when running oacc exec with CUDA 9.0 nvprof"

2020-03-27 Thread Frederik Harwath
Hi Thomas, Thomas Schwinge writes: > On 2020-03-25T18:09:25+0100, I wrote: >> On 2018-02-22T12:23:25+0100, Tom de Vries wrote: >>> when using cuda 9 nvprof with an openacc executable, the executable hangs. > >> What Frederik has discovered today in the hard way... [...] >> -- the hang was bac

Re: [C/C++, OpenACC] Reject vars of different scope in acc declare (PR94120)

2020-03-12 Thread Frederik Harwath
Tobias Burnus writes: Hi Tobias, > Fortran patch: https://gcc.gnu.org/pipermail/gcc-patches/current/541774.html > > "A declare directive must be in the same scope > as the declaration of any var that appears in > the data clauses of the directive." > > ("A declare directive is used […] follo

[PATCH 2/2] Add tests to verify OpenACC clause locations

2019-12-10 Thread Frederik Harwath
Check that the column information for OpenACC clauses is communicated correctly to the middle-end, in particular by the Fortran front-end (cf. PR 92793). 2019-12-10 Frederik Harwath gcc/testsuite/ * gcc.dg/goacc/clause-locations.c: New test. * gfortran.dg/goacc/clause

[PATCH 0/2] Add tests to verify OpenACC clause locations

2019-12-10 Thread Frederik Harwath
ok to include them in trunk? Best regards, Frederik Frederik Harwath (2): Use clause locations in OpenACC nested reduction warnings Add tests to verify OpenACC clause locations gcc/omp-low.c | 2 +- gcc/testsuite/gcc.dg/goacc/clause-locations.c | 17 +++
