Hi,

When running libgomp test-case reduction-7.c on an nvptx accelerator (T400,
driver version 470.86) and GOMP_NVPTX_JIT=-O0, I run into:
...
reduction-7.exe:reduction-7.c:312: v_p_2: \
  Assertion `out[j * 32 + i] == (i + j) * 2' failed.
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/reduction-7.c \
  -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none \
  -O0 execution test
...

During investigation I found ptx code like this:
...
@ %r163 bra $L262;
$L262:
...

There's a known problem with executing this type of code, and a workaround is
in place to address this: prevent_branch_around_nothing.

The workaround does not trigger though because it doesn't handle the nop insn.

Fix this by handling the nop insn in prevent_branch_around_nothing.

Tested libgomp on x86_64 with nvptx accelerator.

Committed to trunk.

Thanks,
- Tom

[nvptx] Handle nop in prevent_branch_around_nothing

gcc/ChangeLog:

2022-01-27  Tom de Vries  <tdevr...@suse.de>

	PR target/100428
	* config/nvptx/nvptx.cc (prevent_branch_around_nothing): Handle nop
	insn.

---
 gcc/config/nvptx/nvptx.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index ceea4d3a093..262e8f9cc1b 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -5103,6 +5103,7 @@ prevent_branch_around_nothing (void)
 	    case CODE_FOR_nvptx_forked:
 	    case CODE_FOR_nvptx_joining:
 	    case CODE_FOR_nvptx_join:
+	    case CODE_FOR_nop:
 	      continue;
 	    default:
 	      seen_label = NULL;
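
For illustration only, here is a standalone sketch of the kind of scan the workaround performs, as I understand it from the description above. It is not the GCC code: the Insn and Kind types and the function find_branches_around_nothing are hypothetical, and the real pass works on RTL insns. The assumption is that some insns (nop, fork, join) emit no ptx instruction, so a conditional branch followed only by such insns and then its own target label is a "branch around nothing".

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified instruction kinds; these are not GCC's RTL codes.
enum class Kind { CondBranch, Label, Nop, Fork, Join, Other };

struct Insn {
  Kind kind;
  std::string label;  // branch target for CondBranch, label name for Label
};

// Returns the indices of labels that are the target of a preceding
// conditional branch with nothing but non-emitting insns in between.
std::vector<std::size_t>
find_branches_around_nothing(const std::vector<Insn>& insns) {
  std::vector<std::size_t> hits;
  const std::string* seen_label = nullptr;  // target of the pending branch
  for (std::size_t i = 0; i < insns.size(); ++i) {
    const Insn& insn = insns[i];
    switch (insn.kind) {
    case Kind::CondBranch:
      seen_label = &insn.label;
      break;
    case Kind::Nop:   // assumed to emit no ptx; skipping it is the fix
    case Kind::Fork:
    case Kind::Join:
      break;          // keep looking for the label
    case Kind::Label:
      if (seen_label && *seen_label == insn.label)
        hits.push_back(i);  // the branch jumps around nothing
      seen_label = nullptr;
      break;
    case Kind::Other:
      seen_label = nullptr;  // real code follows the branch
      break;
    }
  }
  return hits;
}
```

In this model, removing the Kind::Nop case from the skip list would make a nop reset the pending branch, so the pattern would go undetected, which corresponds to the behaviour before the patch.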