On Mon, Jun 22, 2015 at 12:08:36PM -0400, Nathan Sidwell wrote:
> On 06/22/15 11:18, Bernd Schmidt wrote:
>
> >You can have a hint that it is desirable, but not a hint that it is correct
> >(because passes in between may invalidate that). The OpenACC directives
> >guarantee to the compiler that the program can be transformed into a
> >parallel form. If we lose them early we must then rely on our analysis,
> >which may not be strong enough to prove that the loop can be parallelized.
> >If we make these transformations early enough, while we still have the
> >OpenACC directives, we can guarantee that we do exactly what the programmer
> >specified.
>
> How does this differ from openmp's needs to preserve parallelism on a
> parallel loop?  Is it more than the reconvergence issue?
OpenMP has a significantly different execution model: a parallel region in
OpenMP is run by a certain number of threads (the initial thread, i.e. the
one encountering the region, plus, depending on clauses and library
decisions, perhaps others), with a barrier at the end of the region, after
which only the initial thread continues.  So an OpenMP parallel is
implemented as a library call that takes an outlined function built from
the parallel's body as one of its arguments, and that body is executed by
the initial thread and perhaps others.

An OpenMP worksharing loop is just coordination among the threads in the
team as to which thread takes which subset of the loop's iterations,
optionally followed by a barrier.  An OpenMP simd loop is a loop with
certain properties guaranteed by the user so that it can be vectorized.

In contrast to this, OpenACC spawns all the threads/CTAs up front and then
idles some of them until there is work for them.

	Jakub
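
As a rough illustration of the outlining described above, here is a
hand-written sketch (not what GCC actually emits; the GOMP_parallel
prototype is simplified, and the manual static chunking stands in for the
real worksharing lowering done via the GOMP_loop_* routines).  Compile with
gcc -fopenmp (or link against -lgomp):

  #include <omp.h>
  #include <stdio.h>

  struct shared_data { int n; int *a; };

  /* Outlined body of the parallel region; every thread in the team,
     including the initial (encountering) thread, runs this function.  */
  static void
  parallel_body (void *arg)
  {
    struct shared_data *d = arg;
    /* A worksharing loop is just a division of the iterations among the
       team; a static schedule assigns each thread a contiguous chunk.  */
    int nthr = omp_get_num_threads ();
    int tid = omp_get_thread_num ();
    int chunk = (d->n + nthr - 1) / nthr;
    int lo = tid * chunk;
    int hi = lo + chunk < d->n ? lo + chunk : d->n;
    for (int i = lo; i < hi; i++)
      d->a[i] = i * i;
    /* Implicit barrier at the end of the region; afterwards only the
       initial thread continues past the GOMP_parallel call.  */
  }

  /* Declaration of the libgomp entry point the compiler emits a call to
     (simplified; see libgomp for the real prototype).  */
  extern void GOMP_parallel (void (*fn) (void *), void *data,
			     unsigned num_threads, unsigned flags);

  int
  main (void)
  {
    int a[100];
    struct shared_data d = { 100, a };
    /* Roughly what "#pragma omp parallel" lowers to: the body is passed
       as an outlined function plus a pointer to the shared variables;
       num_threads == 0 lets the runtime pick the team size.  */
    GOMP_parallel (parallel_body, &d, 0, 0);
    printf ("a[10] = %d\n", a[10]);
    return 0;
  }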