Hi Sandra,
some observations/comments, but in general it looks good.
Sandra Loosemore wrote:
This patch adds the OMP_METADIRECTIVE tree node and shared tree-level
support for manipulating metadirectives. It defines/exposes
interfaces that will be used in subsequent patches that add front-end
and middle-end support, but nothing generates these nodes yet.
This patch also adds compile-time support for dynamic context
selectors (the target_device selector set and the condition selector
of the user selector set) for metadirectives only. The "declare
variant" directive still supports only static selectors.
...
/* Return 1 if context selector matches the current OpenMP context, 0
if it does not and -1 if it is unknown and need to be determined later.
Some properties can be checked right away during parsing (this routine),
others need to wait until the whole TU is parsed, others need to wait until
- IPA, others until vectorization. */
+ IPA, others until vectorization.
+
+ METADIRECTIVE_P is true if this is a metadirective context, and DELAY_P
+ is true if it's too early in compilation to determine whether some
+ properties match.
+
+ Dynamic properties (which are evaluated at run-time) should always
+ return 1. */
I have to admit that I don't really see the use of metadirective_p as …
int
-omp_context_selector_matches (tree ctx)
+omp_context_selector_matches (tree ctx, bool metadirective_p, bool delay_p)
...
+ if (metadirective_p && delay_p)
+ return -1;
I do see why the resolution of KIND/ARCH/ISA should be delayed – for
both variant/metadirective as long as the code is run by the host and
the device. Except that we could exclude, e.g., 'kind(FPGA)' early on as
we don't support it at all.
But once the device code is split off, I don't see why we can't expand
the DEVICE clause right away for both variant and metadirective – while
for 'target_device', we cannot do much until runtime – except of
excluding things like 'kind(fpga)' – or excluding all 'arch' known not
to be supported neither by the host nor by any enabled offload devices.
Thus, I see why there is a 'delay_p', but not why there is a
'metadirective_p'.
But I might have missed something important ...
case OMP_TRAIT_USER_CONDITION:
if (set == OMP_TRAIT_SET_USER)
for (tree p = OMP_TS_PROPERTIES (ts); p; p = TREE_CHAIN (p))
if (OMP_TP_NAME (p) == NULL_TREE)
{
+ /* OpenMP 5.1 allows non-constant conditions for
+ metadirectives. */
+ if (metadirective_p
+ && !tree_fits_shwi_p (OMP_TP_VALUE (p)))
+ break;
if (integer_zerop (OMP_TP_VALUE (p)))
return 0;
if (integer_nonzerop (OMP_TP_VALUE (p)))
break;
ret = -1;
}
(BTW: I am happy to be enlightened as I likely have miss some fine print.)
Regarding the comment: True, but shouldn't this be handled before by
issuing an error when such a clause is used in 'declare variant', i.e.
only occur when metadirective_p is/can be true?
Besides, I have to admit that I do not understand the new code. The
current code has: constant zero → whole selector known to be false
("return 0"); nonzero constant → keep current state, i.e. either 'true'
(1) or don't known ('-1') and continue; otherwise (not const) → set to
"don't know" (-1) and continue with the next item.
That seems to make also sense for metadirectives. But your patch changes
this to keep current state if a variable. In that case, '1' is used if
this is the only item or the previous condition is true. Or "-1" when
the previous item is "don't know" (-1). - I think that doesn't make
sense and it should always return -1 for a run time value.
Additionally, I wonder why you use tree_fits_shwi_p instead of a simple
'TREE_CODE (OMP_TP_VALUE (p)) != INTEGER_CST'. It does not seem to
matter here, but '(uint128_t)-1' looks like a valid condition and valid
constant, which integer_nonzerop should handled but if the hwi is 128bit
wide, it won't fit into a signed variable.
(As integer_nonzerop and the current code both do "break;" it won't
change the result of the current code.)
* * *
+static tree
+omp_dynamic_cond (tree ctx)
+{
...
+ /* The user condition is not dynamic if it is constant. */
+ if (!tree_fits_shwi_p (TREE_VALUE (expr_list)))
Any reason for using tree_fits_shwi_p instead of INTEGER_CST? Here,
(uint128_t)-1 could make a difference …
+ /* omp_initial_device is -1, omp_invalid_device is -4; choose
+ a value that isn't otherwise defined to indicate the default
+ device. */
+ device_num = build_int_cst (integer_type_node, -2);
Don't do this - we do it differently for 'target' and it should do the
same. Some value usage history:
For target / target data etc, GCC historically used -1 to denote for
that no device clause was specified (→ default device) and >= 0 for a
user-specified device, i.e. "GOMP_target_ext(device_num,..." gets a "-1"
if nothing was specified (= default device).
OpenMP 5.2 then introduced omp_{initial,invalid}_device with
omp_initial_device == -1 and the other one < -1 but implementation defined.
Thus, we have 0...num_devices + -4 (omp_invalid_device) and "-1" for
both default device (GOMP_target_ext etc.) and as host/initial device
(API routines omp_...).
Solution was: For the GOMP_... functions, keep using -1 = default device
and use -2 for omp_initial_device. (And for the API routines, -1 ==
initial device).
If you look at the device number handling in libgomp, functions which
can be called in both context have a "remap" boolean to handle the two
usages for -1.
I strongly suggest to use the same semantic here to avoid confusion,
i.e. -1 = nothing specified, use default device. And map a
user-specified omp_initial_device (-1) to a -2.
And it would help to use GOMP_DEVICE_HOST_FALLBACK (-2, host/initial
device) and GOMP_DEVICE_ICV (-1, default device) where appropriate as
they are a tiny bit more readable and more greppable.
For the conversion, have a look at omp-expand.cc's expand_omp_target and
search for both OMP_CLAUSE_DEVICE (2nd hit) and need_device_adjustment;
if the latter is true: device = (cond ? device :
GOMP_DEVICE_HOST_FALLBACK) [where cond is 'd == -1' or rather
'(unsigned) d > (unsigned)-2']
It might make sense to move (part of) this code out of expand_omp_target
and share it by both; at least the gimple code is rather complex. And I
guess your code also runs in later phase of the compiler and, hence,
also cannot use simple code.
* * *
@@ -4019,6 +4019,40 @@ dump_generic_node (pretty_printer *pp, tree node, int
spc, dump_flags_t flags,
...
+ if (selector == NULL_TREE)
+ pp_string (pp, "default:");
I wonder whether we should be forward looking (OpenMP >= 5.2) and dump
'otherwise:'. (OpenMP <= 5.1 + [deprecated] 5.2 have 'default'.)
Tobias