On 20/01/2026 13:17, Tobias Burnus wrote:
* * *
Likewise for the second target region, where GCC does not
like the 'present' either. Using
'alloc: ... density0'
'always, to: density1'
it fails differently:
libgomp: cuCtxSynchronize error: an illegal memory access was encountered
However, with
'to: density0'
'always, to: density1'
the program compiles and runs past this target region.
However, at runtime, 'from:' in target exit data doesn't bring the data back
for 'density1' (but does for density0) - while 'always, from' (for density1)
will cause:
libgomp: cuCtxSynchronize error: an illegal memory access was encountered
Again, this message is a bit surprising - while the failing copy back seems
to be due to 'density1' not being in the present table, I'd guess.
Here is my understanding of how GCC and libgomp currently handle this
case, step by step:
1) 'target enter data'
a) maps both the descriptor and the data of density0 as it is
allocated and initialised - both refcounts are set to 1;
b) maps the descriptor of density1 but does not create storage on the
device for its data since it is still unallocated - the descriptor's
refcount is set to 1.
2) The first 'target map' runs fine because the *data* of density0 is in
the present table.
3) The second 'target map'
a) works fine for density0 as above;
b) fails for density1 because its data is not in the present table,
even though its descriptor is.
4) Assuming the present modifier is stripped and 3) does not fail,
'target exit data':
a) transfers the data of density0 back to the host;
b) upon exit from 3), the refcount of density1's descriptor is still
1 but that of its data is 0 so it gets unmapped without a chance for the
dev2host transfer to happen.
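
The bookkeeping in steps 1) to 4) can be mimicked with a small toy model.
This is only a sketch of my reading above, in plain Fortran without any
OpenMP; the names (refcount_toy, enter, leave) are made up for illustration
and this is not how libgomp stores its present table.

```fortran
! Toy model of the refcounts described in steps 1) to 4).
! Hypothetical names; NOT libgomp's actual data structures.
program refcount_toy
  implicit none
  integer, parameter :: d0_desc = 1, d0_data = 2, d1_desc = 3, d1_data = 4
  character(len=*), parameter :: names(4) = &
       [character(len=16) :: 'density0 (desc)', 'density0 (data)', &
                             'density1 (desc)', 'density1 (data)']
  integer :: refcount(4) = 0

  ! 1) 'target enter data': density0 is allocated, density1 is not.
  call enter(d0_desc)
  call enter(d0_data)
  call enter(d1_desc)          ! the data of density1 is skipped

  ! (density1 gets allocated on the host at this point)

  ! 3) Second target region with 'present' stripped: entry ...
  call enter(d0_data)
  call enter(d1_data)
  ! ... and exit; the map types on this construct do not copy back.
  call leave(d0_data, copy_back=.false.)   ! 2 -> 1, stays mapped
  call leave(d1_data, copy_back=.false.)   ! 1 -> 0, unmapped here

  ! 4) 'target exit data' with map(from: ...)
  call leave(d0_data, copy_back=.true.)
  call leave(d1_data, copy_back=.true.)

contains

  subroutine enter(i)
    integer, intent(in) :: i
    refcount(i) = refcount(i) + 1
  end subroutine enter

  subroutine leave(i, copy_back)
    integer, intent(in) :: i
    logical, intent(in) :: copy_back
    if (refcount(i) == 0) then
      print '(a,a)', trim(names(i)), ': not mapped, nothing to copy back'
      return
    end if
    refcount(i) = refcount(i) - 1
    if (refcount(i) == 0) then
      if (copy_back) then
        print '(a,a)', trim(names(i)), ': dev2host copy, then unmapped'
      else
        print '(a,a)', trim(names(i)), ': unmapped without copy'
      end if
    end if
  end subroutine leave

end program refcount_toy
```

In this model only the data of density0 reaches the dev2host branch; the
data of density1 is already gone when 'target exit data' is processed,
which is the behaviour described in 4b).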
So we have two issues: the present modifier and the unmapping.
For the former, I would suggest applying the present modifier to the
array descriptor rather than to its data.
For the latter, it is not clear to me whether the OpenMP spec mandates
that both the array descriptor and the data share the same refcount or not.
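
One way to cross-check this reading might be a variant of the testcase in
which density1 is already allocated when 'target enter data' runs, so that
it follows path 1a) instead of 1b). The sketch below is an assumption on my
side, not something verified on an offloading build: the derived types are
flattened into a single hypothetical 'field' variable and the module and
program names are made up.

```fortran
! Sketch only: density1 is allocated on the host *before* 'target enter
! data', so both its descriptor and its data can be mapped there.
! Simplified, hypothetical types; not the original testcase.
module m2
  implicit none
  type field_type
    real(kind=8), allocatable :: density0(:,:), density1(:,:)
  end type field_type
end module m2

program alloc_before_enter_data
  use m2
  implicit none
  type(field_type) :: field

  field%density0 = reshape([1,2,3,4],[2,2])
  ! Allocate density1 up front; the values are placeholders.
  allocate(field%density1(2,2))
  field%density1 = 0

  !$omp target enter data &
  !$omp   map(to: field%density0) &
  !$omp   map(to: field%density1)

  ! Same shape as before, so no reallocation on assignment;
  ! 'always, to' below refreshes the device copy.
  field%density1 = reshape([11,22,33,44],[2,2])

  !$omp target map(present, alloc: field%density0) &
  !$omp        map(always, present, to: field%density1)
    field%density0 = field%density0 * 7
    field%density1 = field%density1 * 3
  !$omp end target

  !$omp target exit data &
  !$omp   map(from: field%density0) &
  !$omp   map(from: field%density1)

  if (any (field%density0 /= 7*reshape([1,2,3,4],[2,2]))) stop 1
  if (any (field%density1 /= 3*reshape([11,22,33,44],[2,2]))) stop 2
end program alloc_before_enter_data
```

If the analysis above is right, I would expect both 'present' checks and
both copies back to succeed here, since the data of density1 then holds a
refcount of 1 from 'target enter data' until 'target exit data'.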
As Cray ftn shares the not-in-present-table behavior for the scalar
case, it is not surprising that it also uses the host value for 'density1'.
But it doesn't have the odd crash GCC has. It behaves identically for
'always, from' and 'from', contrary to GCC:
-------------------------
module m
implicit none
type field_type
real(kind=8), allocatable :: density0(:,:), density1(:,:)
end type field_type
type tile_type
type(field_type) :: field
end type tile_type
type chunk_type
real(kind=8), allocatable :: left_rcv_buffer(:)
type(tile_type), allocatable :: tiles(:)
end type chunk_type
type(chunk_type) :: chunk
end
use m
implicit none
allocate(chunk%tiles(1))
chunk%tiles(1)%field%density0 = reshape([1,2,3,4],[2,2])
!$omp target enter data &
!$omp map(to: chunk%tiles(1)%field%density0) &
!$omp map(to: chunk%tiles(1)%field%density1)
!$omp target map(present, alloc: chunk%tiles(1)%field%density0)
! if (.not. allocated(chunk%tiles(1)%field%density0)) stop 1
! if (any (chunk%tiles(1)%field%density0 /= reshape([1,2,3,4],[2,2]))) stop 1
chunk%tiles(1)%field%density0 = chunk%tiles(1)%field%density0 * 2
!$omp end target
chunk%tiles(1)%field%density1 = reshape([11,22,33,44],[2,2])
!$omp target map(present, alloc: chunk%tiles(1)%field%density0) &
!$omp map(always, present, to: chunk%tiles(1)%field%density1)
! if (.not. allocated(chunk%tiles(1)%field%density0)) stop 1
! if (.not. allocated(chunk%tiles(1)%field%density1)) stop 1
! if (any (chunk%tiles(1)%field%density0 /= 2*reshape([1,2,3,4],[2,2]))) stop 1
! if (any (chunk%tiles(1)%field%density1 /= reshape([11,22,33,44],[2,2]))) stop 1
chunk%tiles(1)%field%density0 = chunk%tiles(1)%field%density0 * 7
chunk%tiles(1)%field%density1 = chunk%tiles(1)%field%density1 * 3
!$omp end target
!$omp target exit data &
!$omp map(from: chunk%tiles(1)%field%density0) &
!$omp map(from: chunk%tiles(1)%field%density1)
print *, chunk%tiles(1)%field%density0
print *, chunk%tiles(1)%field%density1
if (any (chunk%tiles(1)%field%density0 /= 7*2*reshape([1,2,3,4],[2,2]))) stop 1
if (any (chunk%tiles(1)%field%density1 /= 3*reshape([11,22,33,44],[2,2]))) stop 2
end
-------------------------
--
PA