https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93030

            Bug ID: 93030
           Summary: [OpenACC] libgomp.oacc-c-c++-common/deep-copy-10.c
                    FAILS on AMDGCN – invalid 'async' usage?
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: burnus at gcc dot gnu.org
                CC: ams at gcc dot gnu.org, jakub at gcc dot gnu.org,
                    jules at gcc dot gnu.org, tschwinge at gcc dot gnu.org
  Target Milestone: ---

This test case was added as part of the deep-copy/attach/detach support for
OpenACC 2.6 (r279620 to r279631),
https://gcc.gnu.org/ml/gcc-patches/2019-12/threads.html#01247


Running libgomp.oacc-c-c++-common/deep-copy-10.c works on nvptx.

However, it fails with amdgcn with:

Memory access fault by GPU node-2 (Agent handle: 0x6bfe20) on address 0x685000.
Reason: Page not present or supervisor privilege.


It works if one disables the 'async' clauses. I vaguely recall that nvptx does
not really execute asynchronously, while amdgcn does.

In any case, I am not completely sure how the attaching/detaching works if one
runs this concurrently:

  for (int i = 0; i < 99; i++)
    {
      int j;
#pragma acc parallel loop copy(m.a[0:N]) async(i % 2)
      for (j = 0; j < N; j++)
        m.a[j]++;

Is this guaranteed to work? (It might if the increment is atomic and the
copyin/out works with locking and ref counts. But I can see plenty of ways this
can go wrong.)
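
To compare the two behaviours, a reduced variant along the following lines
might help. This is only a sketch, not the actual test case: the one-member
struct, the ITERS constant, the USE_ASYNC macro and the count_mismatches
function are assumptions made for illustration. Without USE_ASYNC the
'async' clause is dropped, which is the configuration reported to work above.

```c
#include <assert.h>
#include <stdlib.h>

#define N 1024
#define ITERS 99

/* Hypothetical reduced struct; the real test case may differ.  */
typedef struct { int *a; } mystruct;

/* Run the loop from the report and return the number of elements
   that were not incremented exactly ITERS times.  */
int
count_mismatches (void)
{
  mystruct m;
  int bad = 0;

  m.a = (int *) calloc (N, sizeof (int));
  assert (m.a != NULL);

#pragma acc enter data copyin(m)

  for (int i = 0; i < ITERS; i++)
    {
      int j;
#ifdef USE_ASYNC
      /* Alternate between two async queues, as in the report.  */
#pragma acc parallel loop copy(m.a[0:N]) async(i % 2)
#else
      /* Synchronous variant: each region finishes before the next starts.  */
#pragma acc parallel loop copy(m.a[0:N])
#endif
      for (j = 0; j < N; j++)
        m.a[j]++;
    }

  /* Drain all async queues before looking at the host data.  */
#pragma acc wait
#pragma acc exit data delete(m)

  for (int j = 0; j < N; j++)
    if (m.a[j] != ITERS)
      bad++;

  free (m.a);
  return bad;
}
```

Built with -fopenacc, once with and once without -DUSE_ASYNC, this should show
whether the failure depends only on the 'async' clause. Without -fopenacc the
pragmas are ignored and the loop simply runs on the host.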
