| Field | Value |
| --- | --- |
| Issue | 143887 |
| Summary | [flang][OpenMP] Parallel do in declare target subroutine does not work |
| Labels | flang |
| Assignees | |
| Reporter | hakostra |

I have `flang` from a recent git commit:
```
$ flang --version
flang version 21.0.0git (https://github.com/llvm/llvm-project.git 40cc7b4578fd2d65aaef8356fbe7caf2d84a8f3e)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/llvm/llvm-project/install/bin
```
I believe the following example is valid OpenMP code:
```
PROGRAM reproducer
    IMPLICIT NONE

    REAL, ALLOCATABLE, TARGET, DIMENSION(:) :: arr
    INTEGER, PARAMETER :: ngrids = 2
    INTEGER, PARAMETER :: cellsperdim = 4
    INTEGER, PARAMETER :: cellspergrid = cellsperdim**3
    INTEGER :: iouter, ip3

    ALLOCATE(arr(ngrids * cellspergrid), source=-1.0)

    !$omp target teams distribute private(ip3) map(tofrom: arr)
    DO iouter = 1, ngrids
        ip3 = (iouter - 1) * cellspergrid + 1
        CALL kernel(arr(ip3))
    END DO
    !$omp end target teams distribute

    PRINT *, arr
    DEALLOCATE(arr)

CONTAINS
    SUBROUTINE kernel(gridarr)
        !$omp declare target

        ! Subroutine arguments
        REAL, INTENT(INOUT), DIMENSION(cellsperdim, cellsperdim, cellsperdim) :: gridarr

        ! Local variables
        INTEGER :: i, j, k

        !$omp parallel do collapse(2) private(i, j, k) shared(gridarr)
        DO i = 1, cellsperdim
            DO j = 1, cellsperdim
                DO k = 1, cellsperdim
                    gridarr(k, j, i) = REAL(i)
                END DO
            END DO
        END DO
        !$omp end parallel do
    END SUBROUTINE kernel
END PROGRAM reproducer
```
This program *works* with both `gfortran` and Cray `ftn`. It also compiles without errors with flang: `flang -O3 -fopenmp -fopenmp-version=52 -fopenmp-targets=nvptx64 flang-omp-device-bug.F90`. However, running it with mandatory offloading gives incorrect results, with every element of `arr` still holding its initial value of -1.0:
```
$ OMP_TARGET_OFFLOAD=mandatory ./a.out
-1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1.
-1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1.
-1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1.
-1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1.
-1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1.
-1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1.
-1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1.
```
Disabling offloading gives the expected results:
```
$ OMP_TARGET_OFFLOAD=disabled ./a.out
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2.
2. 2. 2. 2. 2. 2. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 4. 4. 4. 4.
4. 4. 4. 4. 4. 4. 4. 4. 4. 4. 4. 4. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 2. 3. 3. 3. 3. 3. 3. 3. 3.
3. 3. 3. 3. 3. 3. 3. 3. 4. 4. 4. 4. 4. 4. 4. 4. 4. 4. 4. 4. 4. 4. 4. 4.
```
I created an equivalent example in C, and `gcc`, `clang` (from the same git commit as `flang`), and Cray `cc` all run that example correctly, both with and without offloading.
For offloading I use an Nvidia RTX 4080 Super, except for the Cray tests, which ran on a shared HPC system.
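
The C example itself is not included in this report. For readers who want to compare, the following is only a rough sketch of what an equivalent might look like, not the reporter's actual code. The names `kernel` and `run_reproducer`, the macros, and the mismatch check are assumptions introduced for illustration. The indexing assumes the Fortran column-major layout of `gridarr(k, j, i)`, so `k` varies fastest.

```c
#include <assert.h>
#include <stdlib.h>

#define NGRIDS 2
#define CELLSPERDIM 4
#define CELLSPERGRID (CELLSPERDIM * CELLSPERDIM * CELLSPERDIM)

/* Device-callable routine that fills one grid, mirroring SUBROUTINE kernel. */
#pragma omp declare target
void kernel(float *gridarr)
{
    #pragma omp parallel for collapse(2)
    for (int i = 0; i < CELLSPERDIM; i++) {
        for (int j = 0; j < CELLSPERDIM; j++) {
            for (int k = 0; k < CELLSPERDIM; k++) {
                /* Same element as Fortran gridarr(k+1, j+1, i+1) = REAL(i+1). */
                gridarr[(i * CELLSPERDIM + j) * CELLSPERDIM + k] = (float)(i + 1);
            }
        }
    }
}
#pragma omp end declare target

/* Hypothetical driver: runs the offloaded loop and returns the number of
 * elements that differ from the values printed with offloading disabled. */
int run_reproducer(void)
{
    const int n = NGRIDS * CELLSPERGRID;
    float *arr = malloc(n * sizeof *arr);
    assert(arr != NULL);

    for (int idx = 0; idx < n; idx++) {
        arr[idx] = -1.0f; /* same initial value as source=-1.0 */
    }

    /* One team iteration per grid; each passes the start of its own slice. */
    #pragma omp target teams distribute map(tofrom: arr[0:n])
    for (int iouter = 0; iouter < NGRIDS; iouter++) {
        kernel(&arr[iouter * CELLSPERGRID]);
    }

    int mismatches = 0;
    for (int idx = 0; idx < n; idx++) {
        /* Within a grid, the slowest index i is the offset divided by 16. */
        float expected =
            (float)((idx % CELLSPERGRID) / (CELLSPERDIM * CELLSPERDIM) + 1);
        if (arr[idx] != expected) {
            mismatches++;
        }
    }

    free(arr);
    return mismatches;
}
```

Calling `run_reproducer()` from a small `main` and checking for a return value of 0 would give a pass/fail check. In this sketch, a result like the flang one above, with all values left at -1.0, would show up as 128 mismatches.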