On Wed, 21 Oct 2015, Bernd Schmidt wrote:

> On 10/20/2015 08:34 PM, Alexander Monakov wrote:
> > This patch series ports enough of libgomp.c to get warp-level parallelism
> > working for OpenMP offloading.  The overall approach is as follows.
> 
> Could you elaborate a bit on what you mean by this, just so we understand
> each other in terms of terminology? "Warp-level" sounds to me like you have
> all threads in a warp executing in lockstep at all times. If individual
> threads can take different paths, I'd expect it to be called thread-level
> parallelism or something like that.

Sorry, that was unclear.  What I meant is that there is a degree of
parallelism available across different warps, but not across different teams
(because only one team is spawned), nor across threads within a warp (because
all threads in a warp except one exit immediately -- later on we'd need to
keep them converged so they can enter a simd region together).
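
In CUDA terms, the current scheme amounts to something like the sketch
below.  This is only an illustration of the idea, not the actual libgomp
code; the kernel name and launch geometry are made up.

    /* Each OpenMP thread is a warp, so on kernel entry all CUDA threads
       except lane 0 of their warp exit right away.  */
    __global__ void
    gomp_kernel_sketch (void)
    {
      if (threadIdx.x % warpSize != 0)
        return;  /* Non-master lanes exit immediately; to support simd
                    regions they would instead be kept converged.  */
      /* ... the warp master runs the OpenMP thread's work ... */
    }

    int
    main (void)
    {
      /* One block = one team; 64 threads = two warps = two OpenMP threads.  */
      gomp_kernel_sketch<<<1, 64>>> ();
      return cudaDeviceSynchronize () != cudaSuccess;
    }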
 
> What is your end goal in terms of mapping GPU parallelism onto OpenMP?

An OpenMP team is mapped to a CUDA thread block, an OpenMP thread is mapped
to a warp, and an OpenMP simd lane is mapped to a CUDA thread.  So we follow
the OpenACC model.  Like in OpenACC, we'd need to artificially
deactivate/reactivate warp members on simd region boundaries.
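
To make that concrete, here is a minimal sketch of how a simd region
boundary could look in plain CUDA.  It is my illustration of the intended
scheme, not existing libgomp code, and it assumes a lane-0 broadcast via
__shfl_sync (CUDA 9+; the 2015-era equivalent is __shfl):

    /* Sequential parts run on the warp master only; on entry to a simd
       region the other warp members pick up state via a warp broadcast
       and execute simd lanes in lockstep.  */
    __global__ void
    simd_region_sketch (int *out, int n)
    {
      int lane = threadIdx.x % warpSize;
      int lb = 0, ub = 0;

      if (lane == 0)
        {
          /* Sequential region: warp master computes the loop bounds.  */
          lb = 0;
          ub = n;
        }

      /* simd region entry: broadcast state from the master lane.  */
      lb = __shfl_sync (0xffffffffu, lb, 0);
      ub = __shfl_sync (0xffffffffu, ub, 0);

      /* All warp members execute simd iterations in lockstep.  */
      for (int i = lb + lane; i < ub; i += warpSize)
        out[i] = i;

      /* simd region exit: non-master lanes fall idle again.  */
    }

    int
    main (void)
    {
      int *out;
      cudaMalloc (&out, 64 * sizeof (int));
      simd_region_sketch<<<1, 32>>> (out, 64);  /* One warp.  */
      cudaDeviceSynchronize ();
      cudaFree (out);
      return 0;
    }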

Alexander
