Hello

Thanks for your reply.

On 26/07/2021 3:29 pm, Jakub Jelinek wrote:
On Fri, Jul 09, 2021 at 12:16:15PM +0100, Kwok Cheung Yeung wrote:
3) In the OpenMP examples (version 5.0.1), section 9.7, the example
metadirective.3.c does not work as expected.

#pragma omp declare target
void exp_pi_diff(double *d, double my_pi){
    #pragma omp metadirective \
                when( construct={target}: distribute parallel for ) \
                default( parallel for simd)
...
int main()
{
    ...
    #pragma omp target teams map(tofrom: d[0:N])
    exp_pi_diff(d,my_pi);
    ...
    exp_pi_diff(d,my_pi);

The spec says in this case that the target construct is added to the
construct set because of the function appearing in between omp declare target
and omp end declare target, so the above is something that resolves
statically to distribute parallel for.
It is true that in OpenMP 5.1 the earlier
For functions within a declare target block, the target trait is added to the beginning of the set as c_1 for any versions of the function that are generated for target regions so the total size of the set is increased by 1.
has been mistakenly replaced with:
For device routines, the target trait is added to the beginning of the set as c_1 for any versions of the procedure that are generated for target regions so the total size of the set is increased by 1.
but that has been corrected in 5.2:
C/C++:
For functions that are declared in a code region that is delimited by a declare target directive and its paired end directive, the target trait is added to the beginning of the set as c_1 for any target variants that result from the directive so the total size of the set is increased by one.
Fortran:
If a declare target directive appears in the specification part of a procedure or in the specification part of a procedure interface body, the target trait is added to the beginning of the set as c_1 for any target variants that result from the directive so the total size of the set is increased by one.

So, it is really a static decision that can be decided already during
parsing.

In Section 1.2.2 of the OpenMP TR10 spec, 'target variant' is defined as:

A version of a device routine that can only be executed as part of a target 
region.

So isn't this really saying the same thing as the previous versions of the spec? The target trait is added to the beginning of the construct set _for any target variants_ that result from the directive (implying that it shouldn't be added for non-target variants). In this example, the same function exp_pi_diff is being used in both a target and non-target context, so shouldn't the metadirective resolve differently in the two contexts, independently of the function being declared in a 'declare target' block? If not, there does not seem to be much point in that example (in section 9.7 of the OpenMP Examples v5.0.1).
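
To make the reading above concrete, here is a toy model in plain C of how I understand the wording. It is only a sketch: the struct and function names are invented for illustration and come from neither GCC nor the spec. It assumes the target trait is pushed onto the construct set only when a target variant is being generated, and then resolves the metadirective from the example against that set.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model only: none of these names come from GCC or the OpenMP spec. */
enum { MAX_TRAITS = 8 };

struct construct_set {
    const char *traits[MAX_TRAITS];
    int size;
};

static void
push_trait(struct construct_set *cs, const char *name)
{
    assert(cs->size < MAX_TRAITS);
    cs->traits[cs->size++] = name;
}

/* Construct set seen by a metadirective at the top of a 'declare target'
   function.  Assumption under discussion: "target" is added as c_1 only
   for target variants, so the set grows by one in that case alone.  */
static struct construct_set
make_construct_set(bool is_target_variant)
{
    struct construct_set cs = { { 0 }, 0 };
    if (is_target_variant)
        push_trait(&cs, "target");
    return cs;
}

static bool
has_trait(const struct construct_set *cs, const char *name)
{
    for (int i = 0; i < cs->size; i++)
        if (strcmp(cs->traits[i], name) == 0)
            return true;
    return false;
}

/* Models: when(construct={target}: distribute parallel for)
           default(parallel for simd)  */
static const char *
resolve_metadirective(bool is_target_variant)
{
    struct construct_set cs = make_construct_set(is_target_variant);
    return has_trait(&cs, "target") ? "distribute parallel for"
                                    : "parallel for simd";
}
```

Under this reading, resolve_metadirective(true) picks the distribute parallel for variant and resolve_metadirective(false) falls through to the default, which is the behaviour the example seems to expect from its two calls to exp_pi_diff.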

From reading the spec, I infer that they expect the device and non-device versions of a function with 'declare target' to be separate, but that is not currently the case for GCC - on the host compiler, the same version of the function gets called in both target and non-target regions (though in the target region case, it gets called indirectly via a compiler-generated function with a name like main._omp_fn.0). The offload compiler gets its own streamed version, so there is no conflict there - by definition, its version must be in a target context.
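
As a rough illustration of the host-side situation described above, the following plain C sketch mimics the two call paths. It is not actual GCC output: main_omp_fn_0 stands in for the generated main._omp_fn.0, and the struct omp_data layout, the call counter, and the placeholder function body are all hypothetical.

```c
#include <assert.h>

/* Hypothetical counter, only to observe which body runs. */
static int host_body_calls = 0;

/* Stand-in for the single host copy of the 'declare target' function. */
static void
exp_pi_diff(double *d, double my_pi)
{
    host_body_calls++;
    d[0] = my_pi;               /* placeholder body, not the real example */
}

/* Hypothetical shape of the data block passed to the outlined region. */
struct omp_data {
    double *d;
    double my_pi;
};

/* Stand-in for the compiler-generated main._omp_fn.0 ('.' is not valid
   in a C identifier, hence the underscores).  */
static void
main_omp_fn_0(struct omp_data *data)
{
    assert(data);
    exp_pi_diff(data->d, data->my_pi);
}

/* Mirrors the two calls in main() from the example. */
static int
run_both_contexts(void)
{
    double d[1] = { 0.0 };
    struct omp_data data = { d, 3.14 };

    main_omp_fn_0(&data);       /* call from inside the target region */
    exp_pi_diff(d, 3.14);       /* direct call outside any target region */
    return host_body_calls;
}
```

Both paths end up in the same exp_pi_diff body, so a metadirective inside it that has been resolved once at parse time cannot differ between the target and non-target calls on the host.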

Thanks,

Kwok
