Hi Tobias! On 2024-12-03T11:28:19+0100, Tobias Burnus <tbur...@baylibre.com> wrote: > Thomas Schwinge wrote: >> That's a bug in 'libgomp/plugin/plugin-gcn.c:maybe_init_omp_async' (or >> its users); the real user of ['GOMP_OFFLOAD_openacc_async_construct'] does >> handle the error condition: > > Well, it does not really handle it: It also just calls 'fatal', > admittedly after unlocking but if the program dies it does not > really make a difference.
It makes a very big difference: without unlocking, the process deadlocks. > However, there are known issues with libgomp-plugin calling FATAL, > which does not really trigger the FATAL part in libgomp, cf. > https://gcc.gnu.org/PR109664 Right. > The attached patch now defers the error handling to a bit later, > either still in libgomp-plugin or in one case it even propagates > the error through to a user-called routine. > > Comments before I commit it? Looks good to me, one incremental step forward, thanks! (I see there are more missing-unlocking issues in the GCN 'GOMP_OFFLOAD_openacc_async_construct', and not only there; that's for another day.) Grüße Thomas > plugin/plugin-gcn.c: Fix error handling of > GOMP_OFFLOAD_openacc_async_construct > > Follow up to r15-5392-g884637b6362391. As the name implies, > GOMP_OFFLOAD_openacc_async_construct is also externally called. > Hence, partially revert previous commit to permit unlocking handling > in oacc-async.c's lookup_goacc_asyncqueue by not failing fatally. > > Hence, also the other (indirect) callers had to be updated: > GOMP_OFFLOAD_dev2dev fails now with 'false' and > GOMP_OFFLOAD_async_run fatally. > > libgomp/ChangeLog: > > * plugin/plugin-gcn.c (GOMP_OFFLOAD_dev2dev, GOMP_OFFLOAD_async_run): > Handle omp_async_queue == NULL after call tomaybe_init_omp_async. > (GOMP_OFFLOAD_openacc_async_construct): Use error not fatal error, > partially reverting r15-5392. > > libgomp/plugin/plugin-gcn.c | 14 ++++++++------ > 1 file changed, 8 insertions(+), 6 deletions(-) > > diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c > index d26b93657bf..239acf8cb75 100644 > --- a/libgomp/plugin/plugin-gcn.c > +++ b/libgomp/plugin/plugin-gcn.c > @@ -3939,6 +3939,8 @@ GOMP_OFFLOAD_dev2dev (int device, void *dst, const void > *src, size_t n) > { > struct agent_info *agent = get_agent_info (device); > maybe_init_omp_async (agent); > + if (!agent->omp_async_queue) > + return false; > queue_push_copy (agent->omp_async_queue, dst, src, n); > return true; > } > @@ -4350,6 +4352,8 @@ GOMP_OFFLOAD_async_run (int device, void *tgt_fn, void > *tgt_vars, > } > > maybe_init_omp_async (agent); > + if (!agent->omp_async_queue) > + GOMP_PLUGIN_fatal ("Asynchronous queue initialization failed"); > queue_push_launch (agent->omp_async_queue, kernel, tgt_vars, kla); > queue_push_callback (agent->omp_async_queue, > GOMP_PLUGIN_target_task_completion, async_data); > @@ -4388,9 +4392,7 @@ GOMP_OFFLOAD_openacc_async_exec (void (*fn_ptr) (void > *), > gcn_exec (kernel, devaddrs, dims, targ_mem_desc, true, aq); > } > > -/* Create a new asynchronous thread and queue for running future kernels; > - issues a fatal error if the queue cannot be created as all callers expect > - that the queue exists. */ > +/* Create a new asynchronous thread and queue for running future kernels. */ > > struct goacc_asyncqueue * > GOMP_OFFLOAD_openacc_async_construct (int device) > @@ -4418,17 +4420,17 @@ GOMP_OFFLOAD_openacc_async_construct (int device) > > if (pthread_mutex_init (&aq->mutex, NULL)) > { > - GOMP_PLUGIN_fatal ("Failed to initialize a GCN agent queue mutex"); > + GOMP_PLUGIN_error ("Failed to initialize a GCN agent queue mutex"); > return NULL; > } > if (pthread_cond_init (&aq->queue_cond_in, NULL)) > { > - GOMP_PLUGIN_fatal ("Failed to initialize a GCN agent queue cond"); > + GOMP_PLUGIN_error ("Failed to initialize a GCN agent queue cond"); > return NULL; > } > if (pthread_cond_init (&aq->queue_cond_out, NULL)) > { > - GOMP_PLUGIN_fatal ("Failed to initialize a GCN agent queue cond"); > + GOMP_PLUGIN_error ("Failed to initialize a GCN agent queue cond"); > return NULL; > } >