Hi PA,
as discussed off-list, I stumbled over the call to GOMP_task. I now
understand why: I was looking at a different version of the OpenMP spec.
OpenMP 5.2 contains the changes for spec Issue 2741 "dispatch
construct data scoping issues", which covers: the performance issue due to
'task' compared to a direct call, the effect of unintended
firstprivatization, …
The current version has
(a) nowait
"The addition of the *nowait* element to the semantic requirement set by
the *dispatch* directive has no effect on the dispatch construct apart
from the effect it may have on the arguments that are passed when
calling a function variant." (I assume the latter is about 'append_args'
of interop objects)
(b) depend
"If the *dispatch* directive adds one or more _depend_ element to the
semantic requirement set, and those element are not removed by the
effect of a declare variant directive, the behavior is as if those
properties were applied as *depend* clauses to a *taskwait* construct
that is executed before the *dispatch* region is executed."
I think it would be good to match the 5.2 behavior.
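To make the (b) wording concrete, here is a minimal sketch of the "as if" form: the depend element ends up on a taskwait that runs before the call, and the call itself is a plain direct call with no enclosing task. The names compute and dispatch_depend_as_if are hypothetical, and no 'declare variant' is involved, so this only illustrates the ordering and is not what the patch generates.

```c
#include <assert.h>

/* Hypothetical base function; a real program would also declare a
   variant for it with 'declare variant'.  */
static int
compute (const int *p)
{
  return *p + 1;
}

/* Illustration only: the producer task writes x, the taskwait orders
   the direct call after it.  */
int
dispatch_depend_as_if (void)
{
  int x = 0, res = -1;

  #pragma omp parallel num_threads (2)
  #pragma omp single
  {
    /* Producer task that the dispatch has to wait for.  */
    #pragma omp task depend (out: x) shared (x)
    x = 41;

    /* Source form:
         #pragma omp dispatch depend (in: x)
         res = compute (&x);
       As-if form under the 5.2 wording: wait first, then call
       directly.  */
    #pragma omp taskwait depend (in: x)
    res = compute (&x);
  }

  assert (res == x + 1);
  return res;
}
```

Compared with wrapping the call in a task, nothing is firstprivatized here, which is one of the issues the spec change was meant to address.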
* * *
I have not fully checked whether the 'device' clause is properly
handled. The current wording states:
"If the device clause is present, the value of the default-device-var
ICV is set to the value of the expression in the clause on entry to the
dispatch region and is restored to its previous value at the end of the
region."
For the code itself, it seems to be handled correctly; see the attached
testcase (consider including it).
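In user-level terms, I read the quoted wording as roughly the following save/set/restore sequence. This is only a sketch: call_with_device is a hypothetical helper, the #else stubs are serial stand-ins I added so that it also builds without -fopenmp, and it makes no claim about how the patch actually lowers the clause.

```c
#include <assert.h>

#ifdef _OPENMP
# include <omp.h>
#else
/* Serial stand-ins so the sketch also builds without -fopenmp
   (an assumption of this example, not part of the testcase).  */
static int stub_default_device;
static int omp_get_default_device (void) { return stub_default_device; }
static void omp_set_default_device (int d) { stub_default_device = d; }
#endif

/* Same as f () in the attached testcase.  */
static int
f (void)
{
  return omp_get_default_device ();
}

/* Hand-written equivalent of
     #pragma omp dispatch device (d2)
     dev = f ();
   Hypothetical helper, for illustration only.  */
int
call_with_device (int d2)
{
  int saved = omp_get_default_device ();  /* value on entry */
  omp_set_default_device (d2);            /* set for the region */
  int dev = f ();
  omp_set_default_device (saved);         /* restore at the end */
  assert (omp_get_default_device () == saved);
  return dev;
}
```

Since omp_set_default_device acts on the ICV of the current task, a sequence like this should not be observable from other threads, which is the property I would expect from the dispatch implementation as well.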
I was wondering (and haven't checked) whether the ICV is set too
broadly, i.e. not only in the "data environment"
("The variables associated with the execution of a given region") but
also in a way that is immediately visible to other concurrently running
threads outside of that region.
Can you check? (Although my question might also be answered once I finish
reading the patch …)
Thanks,
Tobias
#include <omp.h>

int f ()
{
  return omp_get_default_device ();
}

int main ()
{
  for (int d = omp_initial_device; d <= omp_get_num_devices (); d++)
    {
      int dev = omp_invalid_device;

      omp_set_default_device (d);

      /* dispatch without a device clause: f sees the default device.  */
      #pragma omp dispatch
        dev = f ();

      if (d == omp_initial_device || d == omp_get_num_devices ())
        {
          if (dev != omp_initial_device && dev != omp_get_num_devices ())
            __builtin_abort ();
          if (omp_get_default_device() != omp_initial_device
              && omp_get_default_device() != omp_get_num_devices ())
            __builtin_abort ();
        }
      else
        if (dev != d || d != omp_get_default_device())
          __builtin_abort ();

      for (int d2 = omp_initial_device; d2 <= omp_get_num_devices (); d2++)
        {
          dev = omp_invalid_device;

          /* dispatch with a device clause: f sees d2, the caller still d.  */
          #pragma omp dispatch device(d2)
            dev = f ();

          if (d == omp_initial_device || d == omp_get_num_devices ())
            {
              if (omp_get_default_device() != omp_initial_device
                  && omp_get_default_device() != omp_get_num_devices ())
                __builtin_abort ();
            }
          else if (d != omp_get_default_device())
            __builtin_abort ();

          if (d2 == omp_initial_device || d2 == omp_get_num_devices ())
            {
              if (dev != omp_initial_device && dev != omp_get_num_devices ())
                __builtin_abort ();
            }
          else if (dev != d2)
            __builtin_abort ();
        }
    }
  return 0;
}