Re: [PATCH 6/6, OpenACC, libgomp] Async re-work, nvptx changes

Thomas Schwinge Thu, 06 Dec 2018 12:58:34 -0800

Hi Chung-Lin!

On Tue, 25 Sep 2018 21:11:58 +0800, Chung-Lin Tang <chunglin_t...@mentor.com> 
wrote:
> Hi Tom,
> this patch removes large portions of plugin/plugin-nvptx.c, since a lot of it 
> is
> now in oacc-async.c now. The new code is essentially a NVPTX/CUDA-specific 
> implementation
> of the new-style goacc_asyncqueues.


> --- a/libgomp/plugin/plugin-nvptx.c
> +++ b/libgomp/plugin/plugin-nvptx.c

> +struct goacc_asyncqueue *
> +GOMP_OFFLOAD_openacc_async_construct (void)
> +{
> +  struct goacc_asyncqueue *aq
> +    = GOMP_PLUGIN_malloc (sizeof (struct goacc_asyncqueue));
> +  aq->cuda_stream = NULL;
> +  CUDA_CALL_ASSERT (cuStreamCreate, &aq->cuda_stream, CU_STREAM_DEFAULT);

Curiously (this was the same in the code before): does this have to be
"CU_STREAM_DEFAULT" instead of "CU_STREAM_NON_BLOCKING", because we want
to block anything from running in parallel with "acc_async_sync" GPU
kernels, that use the "NULL" stream?  (Not asking you to change this now,
but I wonder if this is overly strict?)

> +  if (aq->cuda_stream == NULL)
> +    GOMP_PLUGIN_fatal ("CUDA stream create NULL\n");

Can this actually happen, given the "CUDA_CALL_ASSERT" usage above?

> +  CUDA_CALL_ASSERT (cuStreamSynchronize, aq->cuda_stream);

Why is the synchronization needed here?

> +  return aq;
> +}


Grüße
 Thomas

Re: [PATCH 6/6, OpenACC, libgomp] Async re-work, nvptx changes

Reply via email to