On Wed, Oct 21, 2015 at 03:42:26PM -0400, Nathan Sidwell wrote: > +/* Flags for an OpenACC loop. */ > + > +enum oacc_loop_flags > + {
Weird formatting. I see either enum foobarbaz { e1 = ..., e2 = ... }; or enum foobarbaz { e1 = ..., e2 = ... }; styles being used heavily, but not this one. > + OLF_SEQ = 1u << 0, /* Explicitly sequential */ > + OLF_AUTO = 1u << 1, /* Compiler chooses axes. */ > + OLF_INDEPENDENT = 1u << 2, /* Iterations are known independent. */ > + OLF_GANG_STATIC = 1u << 3, /* Gang partitioning is static (has > op). */ > + > + /* Explicitly specified loop axes. */ > + OLF_DIM_BASE = 4, > + OLF_DIM_GANG = 1u << (OLF_DIM_BASE + GOMP_DIM_GANG), > + OLF_DIM_WORKER = 1u << (OLF_DIM_BASE + GOMP_DIM_WORKER), > + OLF_DIM_VECTOR = 1u << (OLF_DIM_BASE + GOMP_DIM_VECTOR), > + > + OLF_MAX = OLF_DIM_BASE + GOMP_DIM_MAX > + }; > + > + if (checking) > + { > + if (has_seq && (this_mask || has_auto)) > + error_at (gimple_location (stmt), "%<seq%> overrides other OpenACC loop > specifiers"); > + else if (has_auto && this_mask) > + error_at (gimple_location (stmt), "%<auto%> conflicts with other > OpenACC loop specifiers"); > + > + if (this_mask & outer_mask) > + error_at (gimple_location (stmt), "inner loop uses same OpenACC > parallelism as containing loop"); Too long lines. Plus 2 spaces into one. > + if (check && OMP_CLAUSE_OPERAND (c, 0)) > + error_at (gimple_location (stmt), > + "argument not permitted on %<%s%> clause in" %qs instead of %<%s%> ? > @@ -5769,6 +5885,166 @@ lower_send_shared_vars (gimple_seq *ilis > } > } > > +/* Emit an OpenACC head marker call, encapulating the partitioning and > + other information that must be processed by the target compiler. > + Return the maximum number of dimensions the associated loop might > + be partitioned over. */ > + > +static unsigned > +lower_oacc_head_mark (location_t loc, tree clauses, > + gimple_seq *seq, omp_context *ctx) > +{ > + unsigned levels = 0; > + unsigned tag = 0; > + tree gang_static = NULL_TREE; > + auto_vec<tree, 1> args; If you usually push there 3 or 4 arguments, wouldn't it be better to just use auto_vec<tree, 4> args; instead? > + if (gang_static) > + { > + if (DECL_P (gang_static)) Formatting, too many spaces. > + tree marker = build_int_cst > + (integer_type_node, (head ? IFN_UNIQUE_OACC_HEAD_MARK > + : IFN_UNIQUE_OACC_TAIL_MARK)); I really don't like putting the arguments on a different from the function name, unless you have to. Here you can easily do say enum internal_fn marker_val = head ? IFN_UNIQUE_OACC_HEAD_MARK : IFN_UNIQUE_OACC_TAIL_MARK; tree marker = build_int_cst (integer_type_node, marker_val); same number of lines, easier to read. > + gcall *call = gimple_build_call_internal > + (IFN_UNIQUE, 1 + (tofollow != NULL_TREE), marker, tofollow); Similarly. > + gcall *fork = gimple_build_call_internal > + (IFN_UNIQUE, 2, > + build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_FORK), place); > + gcall *join = gimple_build_call_internal > + (IFN_UNIQUE, 2, > + build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_JOIN), place); Likewise. Just use a tree t = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_FORK); gcall *fork = gimple_build_call_internal (IFN_UNIQUE, 2, t, place); etc. > + expr = build2 (TRUNC_MOD_EXPR, ivar_type, ivar, > + fold_convert (ivar_type, collapse->iters)); > + expr = build2 (MULT_EXPR, diff_type, fold_convert (diff_type, expr), > + collapse->step); > + expr = build2 (plus_code, iter_type, collapse->base, > + fold_convert (plus_type, expr)); Shouldn't these be fold_build2 instead? Of course Richi would prefer gimple_build, but omp-low.c has already way too much of fold_build2 + force_gimple_operand_gsi code around that it is fine with me this way. > /* An unduplicable, uncombinable function. Generally used to preserve > a CFG property in the face of jump threading, tail merging or > other such optimizations. The first argument distinguishes > between uses. Other arguments are as needed for use. The return > type depends on use too. */ > DEF_INTERNAL_FN (UNIQUE, ECF_NOTHROW | ECF_LEAF, NULL) > #define IFN_UNIQUE_UNSPEC 0 /* Undifferentiated UNIQUE. */ > + > +/* FORK and JOIN mark the points at which OpenACC partitioned > + execution is entered or exited. They take an INTEGER_CST argument, > + indicating the axis of forking or joining and return nothing. */ > +#define IFN_UNIQUE_OACC_FORK 1 > +#define IFN_UNIQUE_OACC_JOIN 2 > +/* HEAD_MARK and TAIL_MARK are used to demark the sequence entering or > + leaving partitioned execution. */ > +#define IFN_UNIQUE_OACC_HEAD_MARK 3 > +#define IFN_UNIQUE_OACC_TAIL_MARK 4 Shouldn't these be in an enum, to make debugging easier? Jakub