On 9/9/20 2:36 PM, Tobias Burnus wrote:
> Hi Tom,
> 
> On 9/8/20 5:05 PM, Tobias Burnus wrote:
> 
>> On 9/8/20 8:51 AM, Tom de Vries wrote:
>>>     PR target/96964
>>>     * config/nvptx/nvptx.md (define_expand "atomic_test_and_set"): New
>>>     expansion.
>>>     * sync-builtins.def (BUILT_IN_ATOMIC_TEST_AND_SET_1): New builtin.
> 
> I have your patch applied on a current mainline powerpc64le-none-linux-gnu
> + nvptx offloading build.

Thanks for trying this out.

> And I observe the following fails – which seems
> to be new and related to your patch (but I have not confirmed it by
> reverting your libatomic patch).
> 

Could you confirm that?

Meanwhile, I'll try to reproduce on x86_64.

> Required option for the fail: "-O2 -ftracer",
> hence, only the "-O3 ..." testsuite builds fail.
> (-ftracer = "Perform tail duplication to enlarge superblock size.")
> 
> 
> during RTL pass: mach
> asyncwait-1.f90:19: internal compiler error: in nvptx_find_par, at config/nvptx/nvptx.c:3293
> 0x10bf9f13 nvptx_find_par
>         gcc/config/nvptx/nvptx.c:3293
> 0x10bf9b97 nvptx_find_par
>         gcc/config/nvptx/nvptx.c:3320
> 0x10bf9b97 nvptx_find_par
>         gcc/config/nvptx/nvptx.c:3320
> ...
> 
> 
> The ICE occurs for the second assert of:
>         case CODE_FOR_nvptx_join:
>           /* A loop tail.  Finish the current loop and return to
>              parent.  */
>           {
>             unsigned mask = UINTVAL (XVECEXP (PATTERN (end), 0, 0));
> 
>             gcc_assert (par->mask == mask);
>             gcc_assert (par->join_block == NULL);
> 
> gdb shows:
> (gdb) p debug_bb(par->join_block )
> (note 213 30 31 24 [bb 24] NOTE_INSN_BASIC_BLOCK)
> (insn 31 213 204 24 (unspec_volatile:SI [
>             (const_int 4 [0x4])
>         ] UNSPECV_JOIN) "libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90":24:0 237 {nvptx_join}
>      (nil))
> (jump_insn 204 31 205 24 (set (pc)
>         (label_ref 198)) 121 {jump}
>      (nil)
>  -> 198)
> 

Yep, code duplication works against the matching of fork/join markers:
once a join is duplicated, a single fork no longer pairs with a single
join. It's not the first time we've seen this.

Usually the fix is to make an optimization pass conservative with
respect to these fork/join regions, but AFAICT, ftracer already has such
code in ignore_bb_p that tests gimple_call_internal_unique_p.

So, perhaps the ftracer pass is the trigger, but not the pass that does
the problematic transformation? Just a guess at this point.
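
To make the failure mode concrete, here is a minimal C++ sketch of
fork/join matching over a toy CFG. It is not the code in nvptx.c: the
names Marker, Block, Region, Matcher, fork_join_matches, single_join_cfg
and duplicated_join_cfg are all hypothetical, and the walk is a
simplified model that only assumes each fork opens a region and each
join must close the current region exactly once, as the two asserts
quoted above suggest.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the marker a basic block may carry.
enum class Marker { None, Fork, Join };

struct Block {
  Marker marker;           // fork/join marker in this block, if any
  unsigned mask;           // partitioning mask carried by the marker
  std::vector<int> succs;  // indices of successor blocks
};

struct Region {
  unsigned mask;   // mask of the fork that opened the region
  int parent;      // index of the enclosing region, -1 for the root
  int join_block;  // block holding the join, -1 while none is recorded
};

class Matcher {
 public:
  explicit Matcher(const std::vector<Block>& cfg)
      : cfg_(cfg), visited_(cfg.size(), false) {
    regions_.push_back(Region{0, -1, -1});  // root region, never joined
  }

  // Walks the CFG depth-first from block 0; false means a check failed.
  bool run() {
    walk(0, 0);
    return ok_;
  }

 private:
  void walk(int b, int region) {
    assert(b >= 0 && static_cast<std::size_t>(b) < cfg_.size());
    if (!ok_ || visited_[b]) return;
    visited_[b] = true;
    const Block& blk = cfg_[b];
    if (blk.marker == Marker::Fork) {
      // A fork opens a new region nested in the current one.
      regions_.push_back(Region{blk.mask, region, -1});
      region = static_cast<int>(regions_.size()) - 1;
    } else if (blk.marker == Marker::Join) {
      if (region == 0) { ok_ = false; return; }         // join without fork
      Region& r = regions_[region];
      if (r.mask != blk.mask) { ok_ = false; return; }  // models 1st assert
      if (r.join_block != -1) { ok_ = false; return; }  // models 2nd assert
      r.join_block = b;
      region = r.parent;  // the join closes the region
    }
    for (int s : blk.succs) walk(s, region);
  }

  const std::vector<Block>& cfg_;
  std::vector<bool> visited_;
  std::vector<Region> regions_;
  bool ok_ = true;
};

bool fork_join_matches(const std::vector<Block>& cfg) {
  if (cfg.empty()) return true;
  return Matcher(cfg).run();
}

// entry -> fork -> body -> join -> exit
std::vector<Block> single_join_cfg() {
  return {{Marker::None, 0, {1}}, {Marker::Fork, 4, {2}},
          {Marker::None, 0, {3}}, {Marker::Join, 4, {4}},
          {Marker::None, 0, {}}};
}

// Same region, but the block holding the join has been duplicated, so
// the body now reaches two separate join blocks (3 and 4).
std::vector<Block> duplicated_join_cfg() {
  return {{Marker::None, 0, {1}},    {Marker::Fork, 4, {2}},
          {Marker::None, 0, {3, 4}}, {Marker::Join, 4, {5}},
          {Marker::Join, 4, {5}},    {Marker::None, 0, {}}};
}
```

In this model, fork_join_matches (single_join_cfg ()) holds, while
fork_join_matches (duplicated_join_cfg ()) fails on the check that
models the second assert: the region has already recorded block 3 as
its join when the walk reaches the copy in block 4.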

Thanks,
- Tom

> 
> That affects the testcases:
> libgomp.oacc-fortran/asyncwait-1.f90
> libgomp.oacc-fortran/asyncwait-2.f90
> libgomp.oacc-fortran/asyncwait-3.f90
> libgomp.oacc-fortran/atomic_capture-1.f90
> libgomp.oacc-fortran/atomic_update-1.f90
> libgomp.oacc-fortran/classtypes-1.f95
> libgomp.oacc-fortran/collapse-1.f90
> libgomp.oacc-fortran/collapse-2.f90
> libgomp.oacc-fortran/collapse-3.f90
> libgomp.oacc-fortran/collapse-4.f90
> libgomp.oacc-fortran/collapse-5.f90
> libgomp.oacc-fortran/collapse-6.f90
> libgomp.oacc-fortran/collapse-7.f90
> libgomp.oacc-fortran/collapse-8.f90
> libgomp.oacc-fortran/combined-directives-1.f90
> libgomp.oacc-fortran/combined-reduction.f90
> libgomp.oacc-fortran/common-block-1.f90
> libgomp.oacc-fortran/common-block-2.f90
> libgomp.oacc-fortran/common-block-3.f90
> libgomp.oacc-fortran/deep-copy-1.f90
> libgomp.oacc-fortran/deep-copy-3.f90
> libgomp.oacc-fortran/deep-copy-4.f90
> libgomp.oacc-fortran/deep-copy-5.f90
> libgomp.oacc-fortran/deep-copy-6-no_finalize.F90
> libgomp.oacc-fortran/deep-copy-6.f90
> libgomp.oacc-fortran/deep-copy-7.f90
> libgomp.oacc-fortran/deep-copy-8.f90
> libgomp.oacc-fortran/derived-type-1.f90
> libgomp.oacc-fortran/host_data-2.f90
> libgomp.oacc-fortran/host_data-3.f
> libgomp.oacc-fortran/host_data-4.f90
> libgomp.oacc-fortran/implicit-firstprivate-ref.f90
> libgomp.oacc-fortran/lib-14.f90
> libgomp.oacc-fortran/map-1.f90
> libgomp.oacc-fortran/nested-function-1.f90
> libgomp.oacc-fortran/nested-function-2.f90
> libgomp.oacc-fortran/nested-function-3.f90
> libgomp.oacc-fortran/no_create-3.F90
> libgomp.oacc-fortran/optional-data-copyin.f90
> libgomp.oacc-fortran/optional-data-copyout.f90
> libgomp.oacc-fortran/optional-data-enter-exit.f90
> libgomp.oacc-fortran/optional-declare.f90
> libgomp.oacc-fortran/optional-firstprivate.f90
> libgomp.oacc-fortran/optional-reduction.f90
> libgomp.oacc-fortran/optional-update-device.f90
> libgomp.oacc-fortran/optional-update-host.f90
> libgomp.oacc-fortran/parallel-dims.f90
> libgomp.oacc-fortran/parallel-loop-1.f90
> libgomp.oacc-fortran/pr81352.f90
> libgomp.oacc-fortran/pr84028.f90
> libgomp.oacc-fortran/reduction-1.f90
> libgomp.oacc-fortran/reduction-2.f90
> libgomp.oacc-fortran/reduction-3.f90
> libgomp.oacc-fortran/reduction-4.f90
> libgomp.oacc-fortran/reduction-5.f90
> libgomp.oacc-fortran/reduction-6.f90
> libgomp.oacc-fortran/reduction-7.f90
> libgomp.oacc-fortran/reduction-8.f90
> libgomp.oacc-fortran/routine-1.f90
> libgomp.oacc-fortran/routine-2.f90
> libgomp.oacc-fortran/routine-3.f90
> libgomp.oacc-fortran/routine-4.f90
> libgomp.oacc-fortran/routine-7.f90
> libgomp.oacc-fortran/routine-9.f90
> libgomp.oacc-fortran/subarrays-1.f90
> libgomp.oacc-fortran/subarrays-2.f90
> libgomp.oacc-fortran/update-2.f90
> 
> Tobias
> 
> -----------------
> Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München /
> Germany
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung,
> Alexander Walter
