Hi Chung-Lin!

On Tue, 25 Sep 2018 21:11:58 +0800, Chung-Lin Tang <chunglin_t...@mentor.com> wrote:
> Hi Tom,
> this patch removes large portions of plugin/plugin-nvptx.c, since a lot of it is
> now in oacc-async.c now. The new code is essentially a NVPTX/CUDA-specific implementation
> of the new-style goacc_asyncqueues.

> --- a/libgomp/plugin/plugin-nvptx.c
> +++ b/libgomp/plugin/plugin-nvptx.c

> +struct goacc_asyncqueue *
> +GOMP_OFFLOAD_openacc_async_construct (void)
> +{
> +  struct goacc_asyncqueue *aq
> +    = GOMP_PLUGIN_malloc (sizeof (struct goacc_asyncqueue));
> +  aq->cuda_stream = NULL;
> +  CUDA_CALL_ASSERT (cuStreamCreate, &aq->cuda_stream, CU_STREAM_DEFAULT);

Out of curiosity (this was the same in the code before): does this have to be
"CU_STREAM_DEFAULT" instead of "CU_STREAM_NON_BLOCKING" because we want to
block anything from running in parallel with "acc_async_sync" GPU kernels,
which use the "NULL" stream?  (Not asking you to change this now, but I
wonder if this is overly strict?)

> +  if (aq->cuda_stream == NULL)
> +    GOMP_PLUGIN_fatal ("CUDA stream create NULL\n");

Can this actually happen, given the "CUDA_CALL_ASSERT" usage above?

> +  CUDA_CALL_ASSERT (cuStreamSynchronize, aq->cuda_stream);

Why is the synchronization needed here?

> +  return aq;
> +}


Grüße
 Thomas
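
For reference, the reasoning behind the NULL-check question can be sketched
in plain C without CUDA.  This is only an illustration, not the plugin code:
all "mock_*" names and "MOCK_CALL_ASSERT" are hypothetical stand-ins, and it
assumes that "CUDA_CALL_ASSERT" aborts on any non-success result and that a
successful stream creation always stores a non-NULL handle.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the CUDA driver types; not the real API.  */
typedef int mock_result;
#define MOCK_SUCCESS 0
struct mock_stream_st { int id; };
typedef struct mock_stream_st *mock_stream;

/* Assumed shape of CUDA_CALL_ASSERT: abort on any non-success result.  */
#define MOCK_CALL_ASSERT(FN, ...)			\
  do							\
    {							\
      mock_result r_ = FN (__VA_ARGS__);		\
      if (r_ != MOCK_SUCCESS)				\
	{						\
	  fprintf (stderr, #FN " error: %d\n", r_);	\
	  abort ();					\
	}						\
    }							\
  while (0)

/* Mock create call: on success it always stores a non-NULL handle
   (this is the assumption being illustrated).  */
mock_result
mock_stream_create (mock_stream *out, unsigned flags)
{
  (void) flags;
  *out = malloc (sizeof (struct mock_stream_st));
  return *out != NULL ? MOCK_SUCCESS : 2;
}

mock_stream
construct_stream (void)
{
  mock_stream s = NULL;
  MOCK_CALL_ASSERT (mock_stream_create, &s, 0);
  /* Control only reaches this point after a successful create, so under
     the assumption above a separate "s == NULL" fatal check is dead code.
     The assert merely documents that invariant.  */
  assert (s != NULL);
  return s;
}
```

If the real "cuStreamCreate" could report success while leaving the handle
NULL, the explicit check in the patch would still be needed; the sketch only
covers the case where it cannot.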