On 12/02/16 12:10, Tom de Vries wrote:
On 26/01/16 13:49, Jakub Jelinek wrote:
On Tue, Jan 26, 2016 at 01:38:39PM +0100, Tom de Vries wrote:
Ping^3. ( https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01089.html )
First of all, I wonder if it wouldn't be far easier to handle these
during
gimplification rather than during omp expansion or during parsing.
Inside
kernels, do you need to honor any clauses on the acc loop, like
privatization etc., or can you just ignore it altogether (after
parsing them
to ensure they are valid)?
The oacc loop clauses are: gang, worker, vector, seq, auto, tile,
device_type, independent, private, reduction.
AFAIU, there're all safe to ignore. That has largely been the approach
in the gomp-4_0-branch, and sofar I haven't seen any failures due to
ignoring a loop clause in a kernels region.
But we do want to be able to honor loop clauses in a kernels region at
some point. F.i., supporting the independent clause would allow more
test-cases to be parallelized.
At some point we had an implementation of the independent clause in the
gomp-4_0-branch, but that had to be reverted (
https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00696.html ).
Anyway, the implementation of the propagation of the independent
property was to keep the loop directive with the independent clause
until omp-expand (where we have cfg), and set a new field
marked_independent in the corresponding struct loop.
If we want to do the expansion of the loop directive to a normal loop at
gimplication, I see two issues:
- in general, we don't only check for correctness during parsing,
there's also checking being done during scan_omp, which happens in
pass_lower_omp, after gimplification.
- how do we mark the new loop as being independent?
Handling this in expand_omp_for_generic is not really nice, because it
will
make already very complicated function even more complex.
An alternative would be to copy expand_omp_for_generic, apply the patch,
and partially evaluate for the single call introduced in the patch.
Do you prefer this approach?
Jakub,
Following up on your suggestion to implement this during gimplification,
I wrote attached patch.
I'll put it through some openacc testing and add testcases. Is this
approach acceptable for stage4?
Thanks,
- Tom
Ignore acc loop directive in kernels region
---
gcc/gimplify.c | 85 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 85 insertions(+)
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 7be6bd7..cec0627 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -8364,6 +8364,82 @@ find_combined_omp_for (tree *tp, int *walk_subtrees, void *)
return NULL_TREE;
}
+/* Gimplify the loops with index I and higher in omp_for FOR_STMT as a
+ sequential loop, and append the resulting gimple statements to PRE_P. */
+
+static void
+gimplify_omp_for_seq (tree for_stmt, gimple_seq *pre_p, unsigned int i)
+{
+ gcc_assert (OMP_FOR_ORIG_DECLS (for_stmt) == NULL_TREE);
+ unsigned int len = TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt));
+ gcc_assert (i < len);
+
+ /* Gimplify OMP_FOR[i] as:
+
+ if (i == 0)
+ OMP_FOR_PRE_BODY;
+ OMP_FOR_INIT[i];
+ goto <loop_entry_label>;
+ <fall_thru_label>:
+ if (i == len - 1)
+ OMP_FOR_BODY;
+ else
+ OMP_FOR[i+1];
+ OMP_FOR_INCR[i];
+ <loop_entry_label>:
+ if (OMP_FOR_COND[i])
+ goto <fall_thru_label>;
+ else
+ goto <loop_exit_label>;
+ <loop_exit_label>:
+ */
+
+ tree loop_entry_label = create_artificial_label (UNKNOWN_LOCATION);
+ tree fall_thru_label = create_artificial_label (UNKNOWN_LOCATION);
+ tree loop_exit_label = create_artificial_label (UNKNOWN_LOCATION);
+
+ /* if (i = 0) OMP_FOR_PRE_BODY. */
+ if (i == 0)
+ gimplify_and_add (OMP_FOR_PRE_BODY (for_stmt), pre_p);
+
+ /* OMP_FOR_INIT[i]. */
+ tree init = TREE_VEC_ELT (OMP_FOR_INIT (for_stmt), i);
+ gimplify_stmt (&init, pre_p);
+
+ /* goto <loop_entry_label>. */
+ gimplify_seq_add_stmt (pre_p, gimple_build_goto (loop_entry_label));
+
+ /* <fall_thru_label>. */
+ gimplify_seq_add_stmt (pre_p, gimple_build_label (fall_thru_label));
+
+ /* if (i == len - 1) OMP_FOR_BODY
+ else OMP_FOR[i+1]. */
+ if (i == len - 1)
+ gimplify_and_return_first (OMP_FOR_BODY (for_stmt), pre_p);
+ else
+ gimplify_omp_for_seq (for_stmt, pre_p, i + 1);
+
+ /* OMP_FOR_INCR[i]. */
+ tree incr = TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i);
+ gimplify_stmt (&incr, pre_p);
+
+ /* <loop_entry_label>. */
+ gimplify_seq_add_stmt (pre_p, gimple_build_label (loop_entry_label));
+
+ /* if (OMP_FOR_COND[i]) goto <fall_thru_label>
+ else goto <loop_exit_label>. */
+ tree cond = TREE_VEC_ELT (OMP_FOR_COND (for_stmt), i);
+ tree var = TREE_OPERAND (cond, 0);
+ tree final_val = TREE_OPERAND (cond, 1);
+ gimplify_expr (&final_val, pre_p, NULL, is_gimple_val, fb_rvalue);
+ gimple *gimple_cond = gimple_build_cond (TREE_CODE (cond), var, final_val,
+ fall_thru_label, loop_exit_label);
+ gimplify_seq_add_stmt (pre_p, gimple_cond);
+
+ /* <loop_exit_label>. */
+ gimplify_seq_add_stmt (pre_p, gimple_build_label (loop_exit_label));
+}
+
/* Gimplify the gross structure of an OMP_FOR statement. */
static enum gimplify_status
@@ -8403,6 +8479,15 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
gcc_unreachable ();
}
+ if (ort == ORT_ACC
+ && gimplify_omp_ctxp != NULL
+ && gimplify_omp_ctxp->region_type == ORT_ACC_KERNELS)
+ {
+ /* For now, ignore loop directive in kernels region. */
+ gimplify_omp_for_seq (for_stmt, pre_p, 0);
+ return GS_ALL_DONE;
+ }
+
/* Set OMP_CLAUSE_LINEAR_NO_COPYIN flag on explicit linear
clause for the IV. */
if (ort == ORT_SIMD && TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt)) == 1)