On Wed, 2013-08-28 at 13:21 +0200, Richard Biener wrote:
> On Wed, Aug 28, 2013 at 1:06 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> > On Wed, Aug 28, 2013 at 12:39:00PM +0200, Richard Biener wrote:
> >> From the accelerator BOF video I gather we agreed on using the GOMP
> >> representation as unified middle-end.  What I didn't get is whether we
> >> agreed on libgomp being the unified single runtime (that eventually
> >> dispatches to accelerator specific runtimes, opened via dlopen)?
> >
> > I guess that is up to discussions.  It can be e.g. that the libgomp library
> > dlopens libgomp-specific plugins, or that those plugins are written to be
> > usable by more libraries (libopenacc, etc.), or some code for those plugins
> > is shared.
> > Important is also what target "libraries" we actually provide, e.g. OpenMP
> > 4.0 says basically that from target code you can only call code declared
> > or defined in a #pragma omp declare target ... #pragma omp end declare target
> > region, but it pretty much assumes that you can use various omp_* library
> > calls, various #pragma omp ... directives (which probably need some library
> > implementation) and stuff like printf and various math library functions.
>
> My thought was that we need to have control over scheduling and thus have
> a single runtime to be able to execute the following in parallel on the
> accelerator and the CPU:
>
>   #pragma omp parallel
>   {
>   #pragma omp target
>     for (;;)
>       ...
>   #pragma omp for
>     for (;;)
>       ...
>   }
>   #pragma omp wait
>
> that is, the omp target dispatch may not block the CPU.
And that's not the only combination we have to consider.  ISO C++ will come
up with something eventually (both for parallelism and likely as well for
concurrency), and ISO C has a study group (CPLEX) looking at Cilk with some
OpenMP mixed in.  So we will have different programming abstractions
(OpenMP, something Cilk-like, perhaps some kind of lightweight threads for
concurrency, ...) to support, and currently they all use different
schedulers.  There are conversations going on in the respective ISO C++ and
C study groups about how to tame the scheduler side of this, but nothing
tangible has emerged from that so far.

> I can hardly
> see how you can make multiple runtimes co-exist from the GCC code
> generation side.

Perhaps having several runtimes is not as much of a problem as potentially
having several runtimes that can't agree on the same semantics of how to
share resources and what parallel/concurrent tasks look like.  IOW, we
might have to do more "unification" work in the intermediate representation
too (i.e., current GOMP + something + changes).

> > In the Intel MIC case (the only thing I've looked briefly at for how the
> > offloading works - the COI library) you can load binaries and shared
> > libraries either from files or from a host memory image, so e.g. you can
> > embed the libgomp library, some kind of libm and some kind of libc
> > (would that be glibc, newlib, something else?) compiled for the target
> > into some data section inside of the plugin or something
> > (or load it from files of course).  No idea how you do this in the
> > HSAIL case, or PTX.
>
> For HSA you can do arbitrary calls to CPU code (that will then of course
> execute on the CPU).

Right, which means that we would have a dispatch for both directions,
controlled by some part of the HSA runtime.
This would probably also mean that the parts executed on the CPU can
themselves run in parallel, so the HSA runtime would probably want to
handle that by calling back into the generic scheduler code responsible
for all parallel/concurrent tasks.

Torvald