On 9/9/20 2:36 PM, Tobias Burnus wrote:
> Hi Tom,
>
> On 9/8/20 5:05 PM, Tobias Burnus wrote:
>
>> On 9/8/20 8:51 AM, Tom de Vries wrote:
>>>     PR target/96964
>>>     * config/nvptx/nvptx.md (define_expand "atomic_test_and_set"): New
>>>     expansion.
>>>     * sync-builtins.def (BUILT_IN_ATOMIC_TEST_AND_SET_1): New builtin.
>
> I have your patch applied on a current mainline powerpc64le-none-linux-gnu
> + nvptx offloading build.

Thanks for trying this out.

> And I observe the following fails – which seems
> to be new and related to your patch (but I have not confirmed it by
> reverting your libatomic patch).
>

Could you confirm that?  Meanwhile, I'll try to reproduce on x86_64.

> Required option for the fail: "-O2 -ftracer",
> hence, only the "-O3 ..." testsuite builds fail.
> (-ftracer = "Perform tail duplication to enlarge superblock size.")
>
>
> during RTL pass: mach
> asyncwait-1.f90:19: internal compiler error: in nvptx_find_par, at
> config/nvptx/nvptx.c:3293
> 0x10bf9f13 nvptx_find_par
>         gcc/config/nvptx/nvptx.c:3293
> 0x10bf9b97 nvptx_find_par
>         gcc/config/nvptx/nvptx.c:3320
> 0x10bf9b97 nvptx_find_par
>         gcc/config/nvptx/nvptx.c:3320
> ...
>
>
> The ICE occurs for the second assert of:
>         case CODE_FOR_nvptx_join:
>           /* A loop tail.  Finish the current loop and return to
>              parent.  */
>           {
>             unsigned mask = UINTVAL (XVECEXP (PATTERN (end), 0, 0));
>
>             gcc_assert (par->mask == mask);
>             gcc_assert (par->join_block == NULL);
>
> gdb shows:
> (gdb) p debug_bb(par->join_block )
> (note 213 30 31 24 [bb 24] NOTE_INSN_BASIC_BLOCK)
> (insn 31 213 204 24 (unspec_volatile:SI [
>             (const_int 4 [0x4])
>         ] UNSPECV_JOIN)
> "libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90":24:0 237
> {nvptx_join}
>      (nil))
> (jump_insn 204 31 205 24 (set (pc)
>         (label_ref 198)) 121 {jump}
>      (nil)
>  -> 198)
>

Yep, code duplication works against the matching of fork/join, it's not
the first time we see this.

Usually the fix is to make an optimization pass conservative with
respect to these fork/join regions, but AFAICT, ftracer already has such
code in ignore_bb_p that tests gimple_call_internal_unique_p.

So, perhaps the ftracer pass is the trigger, but not the pass that does
the problematic transformation?  Just a guess at this point.
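
To illustrate the failure mode: as I read the two quoted asserts, each
fork is expected to be closed by exactly one join block.  Below is a toy
model in C, not GCC code.  The struct block type, the BLOCK_* kinds and
count_region_joins are hypothetical names made up for this sketch, and
nested regions are ignored.  It counts the distinct join blocks reachable
from a fork, so a well-formed region gives 1 and a region whose join
block has been duplicated gives 2.

```c
#include <assert.h>

#define MAX_BLOCKS 64

/* Hypothetical block kinds for this toy CFG, not GCC's own types.  */
enum block_kind { BLOCK_PLAIN, BLOCK_FORK, BLOCK_JOIN };

struct block
{
  enum block_kind kind;
  int succ[2];  /* Successor block indices; -1 means "no edge".  */
};

/* Count the distinct join blocks reachable from the fork block FORK_IDX
   without walking past a join.  A well-formed region yields 1.  */
int
count_region_joins (const struct block *cfg, int n_blocks, int fork_idx)
{
  unsigned char seen[MAX_BLOCKS] = { 0 };
  int stack[MAX_BLOCKS];
  int top = 0;
  int joins = 0;

  assert (n_blocks <= MAX_BLOCKS);
  assert (fork_idx >= 0 && fork_idx < n_blocks);

  /* Each block is pushed at most once, so the stack cannot overflow.  */
  seen[fork_idx] = 1;
  stack[top++] = fork_idx;

  while (top > 0)
    {
      int b = stack[--top];

      if (cfg[b].kind == BLOCK_JOIN)
        {
          /* The region ends here; do not follow the join's successors.  */
          joins++;
          continue;
        }

      for (int i = 0; i < 2; i++)
        {
          int s = cfg[b].succ[i];
          if (s >= 0 && s < n_blocks && !seen[s])
            {
              seen[s] = 1;
              stack[top++] = s;
            }
        }
    }

  return joins;
}
```

For a diamond-shaped CFG (fork, branch, two arms, one shared join) this
returns 1.  If each arm ends in its own copy of the join instead, it
returns 2, which is the situation in which the second assert would trip.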

Thanks,
- Tom

>
> That affects the testcases:
> libgomp.oacc-fortran/asyncwait-1.f90
> libgomp.oacc-fortran/asyncwait-2.f90
> libgomp.oacc-fortran/asyncwait-3.f90
> libgomp.oacc-fortran/atomic_capture-1.f90
> libgomp.oacc-fortran/atomic_update-1.f90
> libgomp.oacc-fortran/classtypes-1.f95
> libgomp.oacc-fortran/collapse-1.f90
> libgomp.oacc-fortran/collapse-2.f90
> libgomp.oacc-fortran/collapse-3.f90
> libgomp.oacc-fortran/collapse-4.f90
> libgomp.oacc-fortran/collapse-5.f90
> libgomp.oacc-fortran/collapse-6.f90
> libgomp.oacc-fortran/collapse-7.f90
> libgomp.oacc-fortran/collapse-8.f90
> libgomp.oacc-fortran/combined-directives-1.f90
> libgomp.oacc-fortran/combined-reduction.f90
> libgomp.oacc-fortran/common-block-1.f90
> libgomp.oacc-fortran/common-block-2.f90
> libgomp.oacc-fortran/common-block-3.f90
> libgomp.oacc-fortran/deep-copy-1.f90
> libgomp.oacc-fortran/deep-copy-3.f90
> libgomp.oacc-fortran/deep-copy-4.f90
> libgomp.oacc-fortran/deep-copy-5.f90
> libgomp.oacc-fortran/deep-copy-6-no_finalize.F90
> libgomp.oacc-fortran/deep-copy-6.f90
> libgomp.oacc-fortran/deep-copy-7.f90
> libgomp.oacc-fortran/deep-copy-8.f90
> libgomp.oacc-fortran/derived-type-1.f90
> libgomp.oacc-fortran/host_data-2.f90
> libgomp.oacc-fortran/host_data-3.f
> libgomp.oacc-fortran/host_data-4.f90
> libgomp.oacc-fortran/implicit-firstprivate-ref.f90
> libgomp.oacc-fortran/lib-14.f90
> libgomp.oacc-fortran/map-1.f90
> libgomp.oacc-fortran/nested-function-1.f90
> libgomp.oacc-fortran/nested-function-2.f90
> libgomp.oacc-fortran/nested-function-3.f90
> libgomp.oacc-fortran/no_create-3.F90
> libgomp.oacc-fortran/optional-data-copyin.f90
> libgomp.oacc-fortran/optional-data-copyout.f90
> libgomp.oacc-fortran/optional-data-enter-exit.f90
> libgomp.oacc-fortran/optional-declare.f90
> libgomp.oacc-fortran/optional-firstprivate.f90
> libgomp.oacc-fortran/optional-reduction.f90
> libgomp.oacc-fortran/optional-update-device.f90
> libgomp.oacc-fortran/optional-update-host.f90
> libgomp.oacc-fortran/parallel-dims.f90
> libgomp.oacc-fortran/parallel-loop-1.f90
> libgomp.oacc-fortran/pr81352.f90
> libgomp.oacc-fortran/pr84028.f90
> libgomp.oacc-fortran/reduction-1.f90
> libgomp.oacc-fortran/reduction-2.f90
> libgomp.oacc-fortran/reduction-3.f90
> libgomp.oacc-fortran/reduction-4.f90
> libgomp.oacc-fortran/reduction-5.f90
> libgomp.oacc-fortran/reduction-6.f90
> libgomp.oacc-fortran/reduction-7.f90
> libgomp.oacc-fortran/reduction-8.f90
> libgomp.oacc-fortran/routine-1.f90
> libgomp.oacc-fortran/routine-2.f90
> libgomp.oacc-fortran/routine-3.f90
> libgomp.oacc-fortran/routine-4.f90
> libgomp.oacc-fortran/routine-7.f90
> libgomp.oacc-fortran/routine-9.f90
> libgomp.oacc-fortran/subarrays-1.f90
> libgomp.oacc-fortran/subarrays-2.f90
> libgomp.oacc-fortran/update-2.f90
>
> Tobias
>
> -----------------
> Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München /
> Germany
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung,
> Alexander Walter