Hi Thomas, hello all,
the attached follow-up patch does:
* It fixes an issue (thinko) related to Fortran and \0 terminated,
which fails for at least substring strings.
* Includes some minor fixes, e.g. ensuring the device is initialized
in omp_get_uid_from_device, the superfluous 'omp_', or adding some
inits to oacc-host.c.
* Now the plugins return NULL instead of failing when the UID cannot
be obtained; in that case, the fallback UID "OMP_DEV_%d" is used.
Comments or remarks before I commit it?
* * *
Regarding the topic of caching in the plugin instead of in
libgomp: If we want to change it, we either to remove the fallback
and require the existence and success of GOMP_OFFLOAD_get_uid.
Otherwise, with host fallback support, we have to cache it at both
locations, which is somehow not really sensible, either.
Thoughts on this topic?
* * *
Longer reply to Thomas' comments:
Thomas Schwinge wrote:
+ "omp_get_uid_from_device",
..., but here without 'omp_' prefix: 'get_uid_from_device' (and properly
sorted).
Ups! Should be of course without. (as 'omp_' prefix is checked before).
Do we apparently not have test suite coverage for these things?
We do *not* test all API routines. The check is, e.g., used in
gfc_error ("%s cannot contain OpenMP API call in intervening code "
or
"OpenMP runtime API call %qD in a region with "
"%<order(concurrent)%> clause", fndecl);
And we have a few tests for each of them, but not a full set of all
API routines.
* * *
+ const char *uid;
Caching this here, instead of acquiring via 'GOMP_OFFLOAD_get_uid' for
each call, is a minor performance optimization? (Similar to other items
cached here, I guess.)
Yes, but it goes a bit beyond: As the pointer is returned to the user, it
has to be allocated at some point - and cached to avoid allocating more
memory when called repeatable called. As the fallback and host
handling is
also done in target.c, the caching is done here.
(Besides the API routines, two env vars and one context selector for
'target_device' support the UID.)
* * *
Please also update 'libgomp/oacc-host.c:host_dispatch'.
Done.
+ ! Note: In gfortran, strings are \0 termined
+ integer(c_int) function omp_get_device_from_uid(uid) bind(C)
For my understanding: in general, Fortran strings are *not*
NUL-terminated, right? So this is a specific properly of 'gfortran'
and/or this GCC/OpenMP interface,
The Fortran standard leaves this to implementation, but by construction,
there is a length (however it is made handled internally, e.g. via the
declaration) and the actual data. - To aid debugging, gfortran NUL
terminates
them.
However, when thinking a bit more about it, taking a substring of a
null-terminated string will not magically be \0 at the boundary of the
substring. - Thus, the simplified approach failed + a Fortran specific
function had to be added (→ fortran.c).
* * *
+ interface omp_get_uid_from_device
+ ! Deviation from OpenMP 6.0: VALUE added.
(..., which I suppose you've reported to OpenMP...)
No - it is not really a bug in the standard. The OpenMP
specification tries to provide a consistent API - but it
is difficult to create an API without touching the ABI.
For the caller side, the usage is the same independent
whether there is an 'intent(in)' or VALUE attribute,
a Bind(C) with or without binding name. Or also a generic
interface with multiple specific ones - which we do to handle
-fdefault-integer-8.
Obviously, the compiler needs to know those details, but
unless users codes the interface themselves instead of
using omp.h / omp_lib.h / the omp_lib module.
Thus, that's one of the few deviation from the OpenMP
specification which does affect the ABI but not the API.
* * *
+GOMP_OFFLOAD_get_uid (int ord)
+{
I guess I'd have just put this code into 'init_hsa_context', filling a
new statically-sized 'uuid' field in 'hsa_context_info' (like
'driver_version_s'; and assuming that 'hsa_context_info' is the right
abstraction for this), and then just return that 'uuid' from
'GOMP_OFFLOAD_get_uid'.
That would be one option. Still, we have to decide whether we either
want to have strictly everything handled in the device code - including
fallback handling (which could be an UID replacement or a fatal error).
Of we do part of the handling elsewhere, e.g. by permitting that the
plugin can fail or does not provide the functions, we can handle it
in target.c (as currently done) - but then we need to cache it there
as well (or at least the fallbacks).
* * *
That way, you'd avoid the unclear semantics of
who gets to 'free' the buffer returned from 'GOMP_OFFLOAD_get_uid' upon
'GOMP_OFFLOAD_fini_device' -- currently the memory is lost?
Well, depends what you mean by lost. The 'devices' data structure in
target.c is allocated early during device initialization and it is
never deallocated. Hence, also the current "uint" member is never
deallocated and remains until the end of the program accessible.
The variable it self has a rather high lifetime as the user might
store it and only much later use it. As a pointer to the string is is
returned to the user, the lifetime has to be rather long.
* * *
But actually, should we just 'return NULL;' in that case, and let
libgomp
proper handle this gracefully, similar to as if 'GOMP_OFFLOAD_get_uid'
isn't implemented at all by a plugin? (What are the expected OpenMP
semantics?)
The spec is silent on it. Expected is that repeatable runs repeatable
such that (at least) for the non-host devices, the same always the
same UID is used for a specific device.
Most sensible is really the UUID, if available, but we can use a
fallback, if needed.
I now added a "return NULL" which clashes a bit with the topic of
caching in the plugin as now the OMP_DEV_%d string also needs to be
cached, but on the libgomp.so side.
* * *
+ sprintf (str, "GPU-%02x" "%02x" "%02x" "%02x"
[…]
+ (unsigned char) s.bytes[0], (unsigned char) s.bytes[1],
BTW: without the (unsigned char), instead of, e.g, "ef" one gets
"ffffffef" which is a bit surprising at a glance.
Tobias