Hello
Thanks for your reply.
On 26/07/2021 3:29 pm, Jakub Jelinek wrote:
On Fri, Jul 09, 2021 at 12:16:15PM +0100, Kwok Cheung Yeung wrote:
3) In the OpenMP examples (version 5.0.1), section 9.7, the example
metadirective.3.c does not work as expected.
#pragma omp declare target
void exp_pi_diff(double *d, double my_pi){
#pragma omp metadirective \
when( construct={target}: distribute parallel for ) \
default( parallel for simd)
...
int main()
{
...
#pragma omp target teams map(tofrom: d[0:N])
exp_pi_diff(d,my_pi);
...
exp_pi_diff(d,my_pi);
The spec says in this case that the target construct is added to the
construct set because of the function appearing in between omp declare target
and omp end declare target, so the above is something that resolves
statically to distribute parallel for.
It is true that in OpenMP 5.1 the earlier
For functions within a declare target block, the target trait is added to the
beginning of the set as c_1 for any versions of the function that are generated
for target regions so the total size of the set is increased by 1.
has been mistakenly replaced with:
For device routines, the target trait is added to the beginning of the set as
c_1 for any versions of the procedure that are generated for target regions so
the total size of the set is increased by 1.
but that has been corrected in 5.2:
C/C++:
For functions that are declared in a code region that is delimited by a declare
target directive and its paired end directive, the target trait is added to the
beginning of the set as c_1 for any target variants that result from the
directive so the total size of the set is increased by one.
Fortran:
If a declare target directive appears in the specification part of a procedure
or in the specification part of a procedure interface body, the target trait is
added to the beginning of the set as c_1 for any target variants that result
from the directive so the total size of the set is increased by one.
So, it really is a static decision that can already be made during
parsing.
In Section 1.2.2 of the OpenMP TR10 spec, 'target variant' is defined as:
A version of a device routine that can only be executed as part of a target
region.
So isn't this really saying the same thing as the previous versions of the spec?
The target trait is added to the beginning of the construct set _for any target
variants_ that result from the directive (implying that it shouldn't be added
for non-target variants). In this example, the same function exp_pi_diff is
being used in both a target and non-target context, so shouldn't the
metadirective resolve differently in the two contexts, independently of the
function being declared in a 'declare target' block? If not, there does not seem
to be much point in that example (in section 9.7 of the OpenMP Examples v5.0.1).
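To make the two readings concrete, here is a rough sketch in plain C (no OpenMP
involved) of how the metadirective in exp_pi_diff would resolve under each one.
All names in it are made up for illustration, and it only models the target
trait in the construct set; it is not how GCC represents contexts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a construct trait set is an ordered list of
   construct names, outermost first.  */
enum trait { TRAIT_TARGET, TRAIT_TEAMS, TRAIT_PARALLEL };

/* The two alternatives of the metadirective in exp_pi_diff.  */
enum variant { VARIANT_DISTRIBUTE_PARALLEL_FOR, VARIANT_PARALLEL_FOR_SIMD };

struct context {
  enum trait set[8];
  size_t len;
};

/* Add 'target' to the beginning of the set as c_1, so the total size of
   the set is increased by one.  */
static void prepend_target(struct context *ctx)
{
  assert(ctx->len < 8);
  for (size_t i = ctx->len; i > 0; i--)
    ctx->set[i] = ctx->set[i - 1];
  ctx->set[0] = TRAIT_TARGET;
  ctx->len++;
}

static bool has_target(const struct context *ctx)
{
  for (size_t i = 0; i < ctx->len; i++)
    if (ctx->set[i] == TRAIT_TARGET)
      return true;
  return false;
}

/* when( construct={target}: distribute parallel for )
   default( parallel for simd )  */
static enum variant resolve(const struct context *ctx)
{
  return has_target(ctx) ? VARIANT_DISTRIBUTE_PARALLEL_FOR
                         : VARIANT_PARALLEL_FOR_SIMD;
}

/* Context seen by the body of one version of exp_pi_diff.
   static_reading: the trait is added merely because the function is
   declared inside a declare target block.
   Otherwise: the trait is added only for the target variant.  */
static struct context body_context(bool is_target_variant,
                                   bool static_reading)
{
  struct context ctx = { .len = 0 };
  if (static_reading || is_target_variant)
    prepend_target(&ctx);
  return ctx;
}
```

Under the static reading, both calls in main pick distribute parallel for.
Under the per-variant reading, only the target variant does, and the plain host
call falls through to parallel for simd, which is what the example seems to
expect.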
From reading the spec, I infer that they expect the device and non-device
versions of a function with 'declare target' to be separate, but that is not
currently the case for GCC - on the host compiler, the same version of the
function gets called in both target and non-target regions (though in the target
region case, it gets called indirectly via a compiler-generated function with a
name like main._omp_fn.0). The offload compiler gets its own streamed version,
so there is no conflict there - by definition, its version must be in a target
context.
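As a rough illustration of the host-side situation, the sketch below uses plain
C stand-ins for the single host copy of exp_pi_diff and for the outlined
function. The struct, the counter, the placeholder loop body and the
run_both_contexts helper are all hypothetical; only the shape of the call graph
is meant to match what I described.

```c
#include <stddef.h>

/* Counts entries into the one host copy of the function.  */
static int host_body_entries = 0;

/* Stand-in for the single host version of exp_pi_diff.  The loop is
   placeholder work; the real body is elided in the example.  */
static void exp_pi_diff_host(double *d, size_t n, double my_pi)
{
  host_body_entries++;
  for (size_t i = 0; i < n; i++)
    d[i] = my_pi;
}

/* Hypothetical argument block passed to the outlined function.  */
struct omp_data {
  double *d;
  size_t n;
  double my_pi;
};

/* Stand-in for the compiler-generated function (main._omp_fn.0) that
   holds the body of the target region on the host.  */
static void main_omp_fn_0(void *data)
{
  struct omp_data *args = data;
  exp_pi_diff_host(args->d, args->n, args->my_pi);
}

/* Calls the function once through the outlined "target region" path and
   once directly, and returns how often the shared body ran.  */
static int run_both_contexts(void)
{
  double d[4] = { 0 };
  struct omp_data args = { d, 4, 3.0 };

  host_body_entries = 0;
  main_omp_fn_0(&args);          /* target region path on the host */
  exp_pi_diff_host(d, 4, 3.0);   /* plain host call */
  return host_body_entries;
}
```

Both paths end up in the same body, so whatever the metadirective inside it was
resolved to applies to both calls unless the host compiler emits two versions.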
Thanks,
Kwok