Hi Tobias!
On 2024-02-19T22:36:51+0100, Tobias Burnus <[email protected]> wrote:
> While waiting for some testing to finish, I got distracted and added the
> very low hanging OpenACC 3.3 fruits, i.e. those Fortran routines that directly
> map to their C counter part.
>
> Comments, remarks?
Thanks, that largely looks straight-forward. I've not done an in-depth
review, just a few comments. Resolve these as you think is necessary,
and then 'git push'.
I don't know much about Fortran interfaces -- I trust you've got that
under control. ;-)
Thanks for the test cases. Would be nice to have test cases covering all
interfaces -- but I don't think we're currently complete in that regard,
so shall not hold your contribution to higher standards.
> OpenACC: Add Fortran routines
> acc_{alloc,free,hostptr,deviceptr,memcpy_{to,from}_device*}
>
> These routines map simply to the C counterpart and are meanwhile
> defined in OpenACC 3.3. (There are additional routine changes,
> including the Fortran addition of acc_attach/acc_detach, that
> require more work than a simple addition of an interface and
> are therefore excluded.)
I saw:
- <https://gcc.gnu.org/PR113997> "Bogus 'Warning: Interface mismatch in
global procedure' with C binding"
- <https://gcc.gnu.org/PR114002> "[OpenACC][OpenACC 3.3] Add
'acc_attach'/'acc_detach' routine"
> --- a/libgomp/libgomp.texi
> +++ b/libgomp/libgomp.texi
> @section @code{acc_malloc} -- Allocate device memory.
> @table @asis
> @item @emph{Description}
> -This function allocates @var{len} bytes of device memory. It returns
> +This function allocates @var{bytes} of device memory. It returns
Not '@var{bytes} {+bytes+}' or similar?
> @section @code{acc_memcpy_to_device} -- Copy host memory to device memory.
> @item @emph{C/C++}:
> @multitable @columnfractions .20 .80
> -@item @emph{Prototype}: @tab @code{acc_memcpy_to_device(d_void *dest, h_void
> *src, size_t bytes);}
> +@item @emph{Prototype}: @tab @code{void acc_memcpy_to_device(d_void*
> data_dev_dest,}
> +@item @tab @code{h_void* data_host_src, size_t bytes);}
> +@item @emph{Prototype}: @tab @code{void acc_memcpy_to_device_async(d_void*
> data_dev_dest,}
> +@item @tab @code{h_void* data_host_src, size_t bytes, int
> async_arg);}
> +@end multitable
> +
> +@item @emph{Fortran}:
> +@multitable @columnfractions .20 .80
> +@item @emph{Interface}: @tab @code{subroutine
> acc_memcpy_to_device(data_dev_dest, &}
> +@item @tab @code{data_host_src, bytes)}
> +@item @emph{Interface}: @tab @code{subroutine
> acc_memcpy_to_device_async(data_dev_dest, &}
> +@item @tab @code{data_host_src, bytes, async_arg)}
> +@item @tab @code{type(c_ptr), value :: data_dev_dest}
> +@item @tab @code{type(*), dimension(*) :: data_host_src}
> +@item @tab @code{integer(c_size_t), value :: bytes}
> +@item @tab @code{integer(acc_handle_kind), value ::
> async_arg}
> @end multitable
I did wonder whether we should (here, and elsewhere) also update the
'@menu' in "OpenACC Runtime Library Routines" to list the 'async'
routines -- but the OpenACC specification also doesn't, so it shall be
fine as is here, too.
> @item @emph{Reference}:
> @uref{https://www.openacc.org, OpenACC specification v2.6}, section
> -3.2.31.
> +3.2.31 @uref{https://www.openacc.org, OpenACC specification v3.3}, section
(Fine as is, of course, but could -- generally -- simplify the 'diff' by
starting the new '@uref' on its own line.)
> +3.2.26..
Double '.'.
> --- a/libgomp/openacc.f90
> +++ b/libgomp/openacc.f90
> @@ -758,6 +758,93 @@ module openacc_internal
> integer (c_int), value :: async
> end subroutine
> end interface
> +
> + interface
> + type(c_ptr) function acc_malloc (bytes) bind(C)
> +[...]
> + end subroutine
> + end interface
> end module openacc_internal
Assuming that 'module openacc_internal' currently is sorted per
appearance in the OpenACC specification (?), I suggest we continue to do
so. (..., like in 'openacc_lib.h', too.)
> @@ -794,6 +881,9 @@ module openacc
> public :: acc_copyin_async, acc_create_async, acc_copyout_async
> public :: acc_delete_async, acc_update_device_async, acc_update_self_async
> public :: acc_copyout_finalize, acc_delete_finalize
> + public :: acc_malloc, acc_free, acc_map_data, acc_unmap_data, acc_deviceptr
> + public :: acc_hostptr, acc_memcpy_to_device, acc_memcpy_to_device_async
> + public :: acc_memcpy_from_device, acc_memcpy_from_device_async
Likewise.
> @@ -871,9 +961,6 @@ module openacc
> procedure :: acc_on_device_h
> end interface
>
> - ! acc_malloc: Only available in C/C++
> - ! acc_free: Only available in C/C++
> -
> ! As vendor extension, the following code supports both 32bit and 64bit
> ! arguments for "size"; the OpenACC standard only permits default-kind
> ! integers, which are of kind 4 (i.e. 32 bits).
> @@ -953,20 +1040,12 @@ module openacc
> procedure :: acc_update_self_array_h
> end interface
>
> - ! acc_map_data: Only available in C/C++
> - ! acc_unmap_data: Only available in C/C++
> - ! acc_deviceptr: Only available in C/C++
> - ! acc_hostptr: Only available in C/C++
> -
> interface acc_is_present
> procedure :: acc_is_present_32_h
> procedure :: acc_is_present_64_h
> procedure :: acc_is_present_array_h
> end interface
>
> - ! acc_memcpy_to_device: Only available in C/C++
> - ! acc_memcpy_from_device: Only available in C/C++
> -
> interface acc_copyin_async
> procedure :: acc_copyin_async_32_h
> procedure :: acc_copyin_async_64_h
Is that now a different style that we're not listing the new interfaces
in 'module openacc' here?
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.fortran/acc_host_device_ptr.f90
> @@ -0,0 +1,43 @@
> +! { dg-do run }
> +! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } }
> +
> +! Fortran version of libgomp.oacc-c-c++-common/lib-59.c
I like to also put a cross reference into the originating C/C++ test
case, so that anyone adjusting either one also is aware that another one
may need adjusting, too.
> + ! The following assumes sizeof(void*) being the same on host and device:
That's generally required anyway.
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/acc-memcpy.f90
> @@ -0,0 +1,47 @@
> +! { dg-do run }
> +! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } }
> +
> +! based on libgomp.oacc-c-c++-common/lib-60.c
Likewise.
Grüße
Thomas