On 03/22/2018 04:11 PM, Cesar Philippidis wrote:
> On 03/22/2018 07:23 AM, Tom de Vries wrote:
>> On 03/02/2018 05:55 PM, Cesar Philippidis wrote:
>>> (nvptx_declare_function_name): Emit a .maxntid directive hint and
>>> call nvptx_init_oacc_workers.
>>> +
>>> + /* Emit a .maxntid hint to help the PTX JIT emit SYNC branches. */
>>> + if (lookup_attribute ("omp target entrypoint", DECL_ATTRIBUTES (decl))
>>> + && lookup_attribute ("oacc function", DECL_ATTRIBUTES (decl)))
>>> + s << ".maxntid " << cfun->machine->axis_dim[0] << ", "
>>> + << cfun->machine->axis_dim[1] << ", 1\n";
>>> +
>> This change:
>> ...
>> // BEGIN FUNCTION DEF: main$_omp_fn$0
>> .entry main$_omp_fn$0 (.param .u64 %in_ar0)
>> + .maxntid 32, 32, 1
>> ...
>> needs to be an individual patch.
> cfun->machine->axis_dim is something new to the vector length changes,
> so I hard-coded .maxntid to size '32, 32, 1' for og7 as an interim solution.
That's obviously not good enough.
When I compile this test-case:
...
int
main (void)
{
  int a[10];
#pragma acc parallel num_workers (16)
#pragma acc loop worker
  for (int i = 0; i < 10; i++)
    a[i] = i;
  return 0;
}
...
I get:
...
.maxntid 32, 16, 1
...
That's the change you need to isolate.
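
To make the expected behaviour concrete, here is a minimal standalone C++ sketch (not GCC code) of the formatting the quoted hunk performs. The names launch_dims and format_maxntid are hypothetical, and the mapping of the first dimension to the vector length and the second to the worker count is an assumption inferred from the '32, 16, 1' output for num_workers (16).

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for the per-function launch dimensions that the
// patch reads from cfun->machine->axis_dim; not GCC's actual data structure.
struct launch_dims
{
  int vector_length; // assumed to correspond to axis_dim[0] (x dimension)
  int num_workers;   // assumed to correspond to axis_dim[1] (y dimension)
};

// Format a .maxntid directive the same way the quoted hunk does:
// x, then y, then a fixed z dimension of 1.
std::string
format_maxntid (const launch_dims &dims)
{
  std::ostringstream s;
  s << ".maxntid " << dims.vector_length << ", "
    << dims.num_workers << ", 1\n";
  return s.str ();
}
```

With dimensions of 32 and 16 this yields the '.maxntid 32, 16, 1' line, whereas a hard-coded '32, 32, 1' ignores the num_workers clause.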
Thanks,
- Tom