On 09/11/15 16:35, Tom de Vries wrote:
Hi,
this patch series for stage1 trunk adds support to:
- parallelize oacc kernels regions using parloops, and
- map the loops onto the oacc gang dimension.
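
For illustration only, here is a minimal sketch of the kind of loop the
series targets. The function name vec_add and the data clauses are
hypothetical and not taken from the testcases in this series. Whether
parloops actually parallelizes such a loop depends on its dependence
analysis, so this is an example input, not a guaranteed result.

```c
#include <stddef.h>

/* Hypothetical example: a single counted loop inside an OpenACC kernels
   region.  Without -fopenacc the pragma is ignored and the loop simply
   runs sequentially on the host.  */
void
vec_add (const unsigned int *restrict a, const unsigned int *restrict b,
         unsigned int *restrict c, size_t n)
{
#pragma acc kernels copyin (a[0:n], b[0:n]) copyout (c[0:n])
  {
    /* Each iteration writes a distinct element of c and reads only a and b,
       so there is no loop-carried dependence.  */
    for (size_t i = 0; i < n; i++)
      c[i] = a[i] + b[i];
  }
}
```

The restrict qualifiers are an assumption of this sketch: they state that
the three arrays do not overlap.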
The patch series contains these patches:
1 Insert new exit block only when needed in
transform_to_exit_first_loop_alt
2 Make create_parallel_loop return void
3 Ignore reduction clause on kernels directive
4 Implement -foffload-alias
5 Add in_oacc_kernels_region in struct loop
6 Add pass_oacc_kernels
7 Add pass_dominator_oacc_kernels
8 Add pass_ch_oacc_kernels
9 Add pass_parallelize_loops_oacc_kernels
10 Add pass_oacc_kernels pass group in passes.def
11 Update testcases after adding kernels pass group
12 Handle acc loop directive
13 Add c-c++-common/goacc/kernels-*.c
14 Add gfortran.dg/goacc/kernels-*.f95
15 Add libgomp.oacc-c-c++-common/kernels-*.c
16 Add libgomp.oacc-fortran/kernels-*.f95
The first 9 patches are more or less independent, but patches 10-16 are
intended to be committed at the same time.
Bootstrapped and reg-tested on x86_64.
Built and reg-tested with nvidia accelerator, in combination with a
patch that enables accelerator testing (which is submitted at
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).
I'll post the individual patches in reply to this message.
In transform_to_exit_first_loop_alt we insert a new exit block in
between the new loop header and the old exit block. Currently, we do
this even when it is not necessary.
This patch inserts the new exit block only when it is needed, that is,
when the old exit block has more than one predecessor
(!single_pred_p (exit->dest)).
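
The idea can be sketched outside GCC with a toy CFG. Everything below
(toy_block, toy_edge, toy_split_edge, maybe_split_exit) is hypothetical
and only mimics the shape of the single_pred_p / split_edge check in the
patch; it is not the GCC API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy CFG: a block only records how many incoming edges it has.  */
struct toy_block { int n_preds; };
struct toy_edge { struct toy_block *src; struct toy_block *dest; };

/* Hypothetical stand-in for split_edge: redirect E to the empty block
   FRESH, which then falls through to the old destination.  The old
   destination keeps its predecessor count, because the edge from FRESH
   replaces E.  */
struct toy_block *
toy_split_edge (struct toy_edge *e, struct toy_block *fresh)
{
  assert (e != NULL && fresh != NULL);
  fresh->n_preds = 1;
  e->dest = fresh;
  return fresh;
}

/* Split EXIT only when its destination has other predecessors.  If EXIT
   is the only way in, the destination already is a block reached only on
   loop exit, so no new block is needed and NULL is returned.  */
struct toy_block *
maybe_split_exit (struct toy_edge *exit, struct toy_block *fresh)
{
  if (exit->dest->n_preds == 1)
    return NULL;
  return toy_split_edge (exit, fresh);
}
```

A caller then has to handle the NULL case, which is what the
new_exit_block != NULL checks in the patch do.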
Thanks,
- Tom
Insert new exit block only when needed in transform_to_exit_first_loop_alt
2015-06-30 Tom de Vries <t...@codesourcery.com>
* tree-parloops.c (transform_to_exit_first_loop_alt): Insert new exit
block only when needed.
---
gcc/tree-parloops.c | 42 ++++++++++++++++++++++++++++--------------
1 file changed, 28 insertions(+), 14 deletions(-)
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index 3d41275..6a49aa9 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -1695,10 +1695,15 @@ transform_to_exit_first_loop_alt (struct loop *loop,
/* Set the latch arguments of the new phis to ivtmp/sum_b. */
flush_pending_stmts (post_inc_edge);
- /* Create a new empty exit block, inbetween the new loop header and the old
- exit block. The function separate_decls_in_region needs this block to
- insert code that is active on loop exit, but not any other path. */
- basic_block new_exit_block = split_edge (exit);
+
+ basic_block new_exit_block = NULL;
+ if (!single_pred_p (exit->dest))
+ {
+ /* Create a new empty exit block, inbetween the new loop header and the
+ old exit block. The function separate_decls_in_region needs this block
+ to insert code that is active on loop exit, but not any other path. */
+ new_exit_block = split_edge (exit);
+ }
/* Insert and register the reduction exit phis. */
for (gphi_iterator gsi = gsi_start_phis (exit_block);
@@ -1706,17 +1711,24 @@ transform_to_exit_first_loop_alt (struct loop *loop,
gsi_next (&gsi))
{
gphi *phi = gsi.phi ();
+ gphi *nphi = NULL;
tree res_z = PHI_RESULT (phi);
+ tree res_c;
- /* Now that we have a new exit block, duplicate the phi of the old exit
- block in the new exit block to preserve loop-closed ssa. */
- edge succ_new_exit_block = single_succ_edge (new_exit_block);
- edge pred_new_exit_block = single_pred_edge (new_exit_block);
- tree res_y = copy_ssa_name (res_z, phi);
- gphi *nphi = create_phi_node (res_y, new_exit_block);
- tree res_c = PHI_ARG_DEF_FROM_EDGE (phi, succ_new_exit_block);
- add_phi_arg (nphi, res_c, pred_new_exit_block, UNKNOWN_LOCATION);
- add_phi_arg (phi, res_y, succ_new_exit_block, UNKNOWN_LOCATION);
+ if (new_exit_block != NULL)
+ {
+ /* Now that we have a new exit block, duplicate the phi of the old
+ exit block in the new exit block to preserve loop-closed ssa. */
+ edge succ_new_exit_block = single_succ_edge (new_exit_block);
+ edge pred_new_exit_block = single_pred_edge (new_exit_block);
+ tree res_y = copy_ssa_name (res_z, phi);
+ nphi = create_phi_node (res_y, new_exit_block);
+ res_c = PHI_ARG_DEF_FROM_EDGE (phi, succ_new_exit_block);
+ add_phi_arg (nphi, res_c, pred_new_exit_block, UNKNOWN_LOCATION);
+ add_phi_arg (phi, res_y, succ_new_exit_block, UNKNOWN_LOCATION);
+ }
+ else
+ res_c = PHI_ARG_DEF_FROM_EDGE (phi, exit);
if (virtual_operand_p (res_z))
continue;
@@ -1724,7 +1736,9 @@ transform_to_exit_first_loop_alt (struct loop *loop,
gimple *reduc_phi = SSA_NAME_DEF_STMT (res_c);
struct reduction_info *red = reduction_phi (reduction_list, reduc_phi);
if (red != NULL)
- red->keep_res = nphi;
+ red->keep_res = (nphi != NULL
+ ? nphi
+ : phi);
}
/* We're going to cancel the loop at the end of gen_parallel_loop, but until
--
1.9.1