https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121664
Bug ID: 121664
Summary: [Nvptx][OpenMP] 'omp target ... simd' with 'collapse' – leads to illegal memory access
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Keywords: openmp, wrong-code
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: burnus at gcc dot gnu.org
CC: tschwinge at gcc dot gnu.org
Depends on: 121453
Target Milestone: ---

Created attachment 62196
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62196&action=edit
test.f90 - compile for Nvptx offloading with "gfortran -fopenmp -O2 -fno-tree-loop-vectorize"

+++ This bug was initially created as a clone of Bug #121453 +++

This shows up with the SPECaccel v2023 testcase '455.seismic', which fails with nvptx offloading as:

  libgomp: cuCtxSynchronize error: an illegal memory access was encountered

or in the debugger:

  CUDA Exception: Warp Out-of-range Address

For the big program, it occurs with 'src.alt/omp_target' for the first loop:

  !$omp target teams distribute parallel do simd collapse(3)

with -O2 or -O3. However, it starts to pass when the program is reduced; the attached simplified testcase still fails when compiled with -O2 -fno-tree-loop-vectorize.

NOTE: It works on the host or with AMD GPU (gfx90a) offloading, while it fails with sm_70 and sm_86 Nvidia GPUs.

Passing GOMP_NVPTX_JIT=-O0 or GOMP_NVPTX_JIT=-O2 does not change the result. And it happens every time, using both 12.2 and 13.0.

It does not seem to be a regression.

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121453
[Bug 121453] [OpenMP] 'omp simd' with 'collapse' – variable '.count' uninitialized, but used as 'if (.iter.14 == .count.15)'