On 9/9/20 4:14 PM, Tom de Vries wrote:
> On 9/9/20 3:15 PM, Tom de Vries wrote:
>> On 9/9/20 2:36 PM, Tobias Burnus wrote:
>>> Hi Tom,
>>>
>>> On 9/8/20 5:05 PM, Tobias Burnus wrote:
>>>
>>>> On 9/8/20 8:51 AM, Tom de Vries wrote:
>>>>>     PR target/96964
>>>>>     * config/nvptx/nvptx.md (define_expand "atomic_test_and_set"): New
>>>>>     expansion.
>>>>>     * sync-builtins.def (BUILT_IN_ATOMIC_TEST_AND_SET_1): New builtin.
>>>
>>> I have your patch applied on a current mainline powerpc64le-none-linux-gnu
>>> + nvptx offloading build.
>>
>> Thanks for trying this out.
>>
>>> And I observe the following failures, which seem
>>> to be new and related to your patch (though I have not confirmed this by
>>> reverting your libatomic patch).
>>>
>>
>> Could you confirm that?
>>
>> Meanwhile, I'll try to reproduce on x86_64.
>>
>>> Options required for the failure: "-O2 -ftracer";
>>> hence, only the "-O3 ..." testsuite builds fail.
>>> (-ftracer = "Perform tail duplication to enlarge superblock size.")
>>>
>>>
>>> during RTL pass: mach
>>> asyncwait-1.f90:19: internal compiler error: in nvptx_find_par, at config/nvptx/nvptx.c:3293
>>> 0x10bf9f13 nvptx_find_par
>>>         gcc/config/nvptx/nvptx.c:3293
>>> 0x10bf9b97 nvptx_find_par
>>>         gcc/config/nvptx/nvptx.c:3320
>>> 0x10bf9b97 nvptx_find_par
>>>         gcc/config/nvptx/nvptx.c:3320
>>> ...
>>>
>>>
>>> The ICE occurs for the second assert of:
>>>         case CODE_FOR_nvptx_join:
>>>           /* A loop tail.  Finish the current loop and return to
>>>              parent.  */
>>>           {
>>>             unsigned mask = UINTVAL (XVECEXP (PATTERN (end), 0, 0));
>>>
>>>             gcc_assert (par->mask == mask);
>>>             gcc_assert (par->join_block == NULL);
>>>
>>> gdb shows:
>>> (gdb) p debug_bb(par->join_block )
>>> (note 213 30 31 24 [bb 24] NOTE_INSN_BASIC_BLOCK)
>>> (insn 31 213 204 24 (unspec_volatile:SI [
>>>             (const_int 4 [0x4])
>>>         ] UNSPECV_JOIN) "libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90":24:0 237 {nvptx_join}
>>>      (nil))
>>> (jump_insn 204 31 205 24 (set (pc)
>>>         (label_ref 198)) 121 {jump}
>>>      (nil)
>>>  -> 198)
>>>
>>
>> Yep, code duplication works against the matching of fork/join; it's not
>> the first time we've seen this.
>>
>> Usually the fix is to make an optimization pass conservative with
>> respect to these fork/join regions, but AFAICT, ftracer already has such
>> code in ignore_bb_p that tests gimple_call_internal_unique_p.
>>
>> So, perhaps the ftracer pass is the trigger, but not the pass that does
>> the problematic transformation? Just a guess at this point.
>>
> 
> I can reproduce it, and it's indeed the ftracer pass that does the
> duplication.  So the question is why ignore_bb_p doesn't work.

Filed PR https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97000 for this.
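
For illustration only, here is a toy model in C of why a duplicated join
block trips the second assert quoted above. It is a sketch, not GCC code:
the toy_* types and functions are hypothetical stand-ins, and the walk is
only assumed to resemble what nvptx_find_par does (follow the CFG from the
fork and record the single join block of the region).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; these are not GCC's types. */
enum toy_marker { TOY_NONE, TOY_FORK, TOY_JOIN };

struct toy_block {
    enum toy_marker marker;
    unsigned mask;      /* partitioning mask carried by a fork/join marker */
    int succ[2];        /* successor block indices, -1 if absent */
    bool visited;
};

struct toy_par {
    unsigned mask;
    int join_block;     /* -1 until the region's join has been seen */
};

/* Depth-first walk from block IDX.  Returns false where the real code
   would hit one of the two gcc_asserts quoted above.  */
static bool toy_walk(struct toy_block *cfg, int idx, struct toy_par *par)
{
    if (idx < 0 || cfg[idx].visited)
        return true;
    cfg[idx].visited = true;

    if (cfg[idx].marker == TOY_JOIN) {
        /* First check: masks must match.  Second check: no join block
           may have been recorded for this region yet.  */
        if (par->mask != cfg[idx].mask || par->join_block != -1)
            return false;
        par->join_block = idx;
        return true;    /* the region ends at its join */
    }

    for (int i = 0; i < 2; i++)
        if (!toy_walk(cfg, cfg[idx].succ[i], par))
            return false;
    return true;
}

/* Check the region that starts at the fork block FORK_IDX.  */
bool toy_region_ok(struct toy_block *cfg, int fork_idx)
{
    assert(fork_idx >= 0 && cfg[fork_idx].marker == TOY_FORK);
    struct toy_par par = { cfg[fork_idx].mask, -1 };
    return toy_walk(cfg, fork_idx, &par);
}
```

With one join block reachable from both paths, the walk succeeds. Once the
join block is duplicated so that each path ends in its own copy, the second
copy finds join_block already set and the check fails.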

Thanks,
- Tom
