On 03/22/2018 04:11 PM, Cesar Philippidis wrote:
On 03/22/2018 07:23 AM, Tom de Vries wrote:
On 03/02/2018 05:55 PM, Cesar Philippidis wrote:

     (nvptx_declare_function_name): Emit a .maxntid directive hint and
     call nvptx_init_oacc_workers.
+
+  /* Emit a .maxntid hint to help the PTX JIT emit SYNC branches.  */
+  if (lookup_attribute ("omp target entrypoint", DECL_ATTRIBUTES (decl))
+      && lookup_attribute ("oacc function", DECL_ATTRIBUTES (decl)))
+    s << ".maxntid " << cfun->machine->axis_dim[0] << ", "
+      << cfun->machine->axis_dim[1] << ", 1\n";
+
This change:
...
  // BEGIN FUNCTION DEF: main$_omp_fn$0
  .entry main$_omp_fn$0 (.param .u64 %in_ar0)
+  .maxntid 32, 32, 1
...
needs to be an individual patch.
cfun->machine->axis_dim is something new to the vector length changes,
so I hard-coded .maxntid to size '32, 32, 1' for og7 as an interim solution.


That's obviously not good enough.

When I compile this test-case:
...
int
main (void)
{
  int a[10];
#pragma acc parallel num_workers (16)
#pragma acc loop worker
  for (int i = 0; i < 10; i++)
    a[i] = i;

  return 0;
}
...

I get:
...
 .maxntid 32, 16, 1
...

That's the change you need to isolate.
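
To make the point concrete, here is a minimal standalone sketch of the formatting step, not the actual nvptx.c code. The struct axis_dims and the function format_maxntid are hypothetical names used only for illustration. The mapping of index 0 to the vector length and index 1 to the worker count is an assumption inferred from the "32, 16, 1" output above.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for cfun->machine->axis_dim; not GCC's real type.
// Assumption: element 0 is the vector length, element 1 the worker count.
struct axis_dims
{
  int vector_length;
  int num_workers;
};

// Build the .maxntid directive from the per-function dimensions instead
// of a hard-coded "32, 32, 1".  The third dimension stays fixed at 1,
// as in the patch hunk quoted above.
std::string
format_maxntid (const axis_dims &dims)
{
  std::stringstream s;
  s << ".maxntid " << dims.vector_length << ", "
    << dims.num_workers << ", 1\n";
  return s.str ();
}
```

Under those assumptions, format_maxntid ({32, 16}) yields the ".maxntid 32, 16, 1" line shown for the num_workers (16) test case, whereas a hard-coded variant would print "32, 32, 1" regardless of the clause.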

Thanks,
- Tom
