On Thu, Jun 25, 2015 at 22:10:58 +0200, Jakub Jelinek wrote:
> On Thu, Jun 25, 2015 at 10:45:29PM +0300, Ilya Verbin wrote:
> > So, as I understood, three tasks will be generated almost simultaneously in
> > foo1: one on host and two on target.
> > Target task 1 will be executed immediately.
> > Host task will wait for task 1 to be completed on target.
> > (Or it is not possible to mix "omp target" and "omp task" dependencies?)
> > And task 2 will wait on target for task 1.
>
> My understanding is that you don't create any extra tasks,
> but rather you pointer translate the host address from the start of the
> variable (or array section; thus the depend clause argument) into
> target address, and check if it can be offloaded right away (no need
> to wait for dependencies). If yes, you just offload it, with nowait
> without waiting in the caller till it finishes. If not, you arrange
> that when some other offloaded job finishes that provides the dependency,
> your scheduled job is executed.
> So, the task on the target is the implicit one, what executes the
> body of the target region.
> In tasking (task.c) dependencies are only honored for sibling tasks,
> whether the different target implicit tasks are sibling is questionable and
> supposedly should be clarified, but I can't imagine they aren't meant to.
> So, you don't really need to care about the task.c dependencies, target.c
> could have its own ones if it is easier to write it that way.
> Supposedly for nowait you want to spawn or queue the job and return right
> away, and for queued job stick it into some data structure (supposedly
> inside of libgomp on the host) that when the library is (asynchronously)
> notified that some offloaded job finished you check the data structures
> and spawn something different. Or have the data structures on the offloaded
> device instead?
>
> In any case, I'd look what the Mentor folks are doing for OpenACC async
> offloading, what libmicoffload allows you to do and figure out something
> from that.
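
The host-side variant described above could be sketched roughly as follows.
This is only an illustration under my own assumptions, not code from libgomp:
every name here (struct job, job_submit, job_finished, the fixed-size table)
is hypothetical, the depend addresses stand in for the translated target
addresses, and "starting" a job is just a state change where a real
implementation would ask the plugin to offload it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for "target nowait depend(...)" jobs.
   None of these names exist in libgomp.  */
enum dep_kind { DEP_IN, DEP_OUT };
enum job_state { JOB_QUEUED, JOB_RUNNING, JOB_DONE };

#define MAX_DEPS 4
#define MAX_JOBS 16

struct dep { const void *addr; enum dep_kind kind; };

struct job
{
  struct dep deps[MAX_DEPS];
  size_t ndeps;
  enum job_state state;
};

static struct job jobs[MAX_JOBS];	/* Kept in submission order.  */
static size_t njobs;

/* Two jobs conflict if they name the same address and at least one of
   the two depend items is an "out".  */
static bool
deps_conflict (const struct job *a, const struct job *b)
{
  for (size_t i = 0; i < a->ndeps; i++)
    for (size_t j = 0; j < b->ndeps; j++)
      if (a->deps[i].addr == b->deps[j].addr
	  && (a->deps[i].kind == DEP_OUT || b->deps[j].kind == DEP_OUT))
	return true;
  return false;
}

/* A job is ready when no earlier, unfinished job conflicts with it.  */
static bool
job_ready (size_t id)
{
  for (size_t i = 0; i < id; i++)
    if (jobs[i].state != JOB_DONE && deps_conflict (&jobs[i], &jobs[id]))
      return false;
  return true;
}

/* Record a new job and return its id without waiting for it.  The job
   is marked running right away if nothing blocks it, queued otherwise.  */
static size_t
job_submit (const struct dep *deps, size_t ndeps)
{
  assert (njobs < MAX_JOBS && ndeps <= MAX_DEPS);
  size_t id = njobs++;
  for (size_t i = 0; i < ndeps; i++)
    jobs[id].deps[i] = deps[i];
  jobs[id].ndeps = ndeps;
  jobs[id].state = job_ready (id) ? JOB_RUNNING : JOB_QUEUED;
  return id;
}

/* To be called when the library is notified that job ID has finished:
   start every later queued job that has become ready.  */
static void
job_finished (size_t id)
{
  jobs[id].state = JOB_DONE;
  for (size_t i = id + 1; i < njobs; i++)
    if (jobs[i].state == JOB_QUEUED && job_ready (i))
      jobs[i].state = JOB_RUNNING;
}
```

With this, a job with depend(out: a) starts at once, a following job with
depend(in: a) stays queued until job_finished is called for the first one,
and an unrelated job with depend(out: c) starts at once as well. The sketch
does not answer whether such a table belongs in libgomp or in each plugin.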

One big question is who will maintain the list of scheduled jobs, their
dependencies, etc.: libgomp or each target plugin?

OpenACC has async queues:
#pragma acc parallel async(2) wait(1)

But it's not possible to have 2 waits like:
#pragma acc parallel async(3) wait(1) wait(2)

(GOMP_OFFLOAD_openacc_async_wait_async has only one argument, the number of
the queue to wait on.)
Thomas, please correct me if I'm wrong.

In this regard, OpenMP is more complicated, since it allows e.g.:
#pragma omp target nowait depend(in: a, b) depend(out: c, d)

Currently I'm trying to figure out what liboffloadmic can do.

BTW, do you plan to remove GOMP_MAP_POINTER mappings from array sections?
The enter/exit patch for libgomp depends on this change.

  -- Ilya