Hi,

On Tue, Oct 20, 2015 at 09:34:22PM +0300, Alexander Monakov wrote:
> Hello,
>
> This patch series moves libgomp/nvptx porting further along to get initial
> bits of parallel execution working, mostly unbreaking the testsuite.  Please
> have a look!  I'm interested in feedback, and would like to know if it's
> suitable to become a part of a branch.
>
> This patch series ports enough of libgomp.c to get warp-level parallelism
> working for OpenMP offloading.  The overall approach is as follows.
>
> I've opted not to use dynamic parallelism.

In that case, I encourage you to have a look at omp-low.c (and gimple.h)
in the hsa branch.  Since Cauldron I have improved the code that processes
constructs such as

  #pragma omp target teams distribute parallel for

so that it creates copies of the target bodies that are suitable for
execution as one kernel (all of it is eventually outlined into one single
function).  It does not handle more complicated cases (most notably,
supporting reductions well will require quite some work), but I now think
this is the way to expand code for GPUs, at least for the sources where it
makes sense.

The code is still OpenMP 4.0; I have only just recently started porting it
to the 4.5 that landed in trunk.  When I am done, I will write up an
overview and post the first patch for review.  Let's hope it is not going
to take too long.

Martin
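
For illustration, a minimal sketch of the kind of source the combined
construct above applies to.  The function name saxpy, the workload, and the
map clauses are assumptions made for this example; they are not taken from
the hsa branch or from the patch series.  The point is only that the loop
body is simple enough to be outlined as a single kernel.

```c
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i] for i in [0, n).
   With offloading enabled, the combined construct makes the whole loop a
   candidate for execution as one GPU kernel.  Without OpenMP support the
   pragma is ignored and the loop simply runs serially on the host. */
void saxpy(int n, float a, const float *x, float *y)
{
#pragma omp target teams distribute parallel for \
    map(to: x[0:n]) map(tofrom: y[0:n])
  for (int i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}
```

Calling saxpy(4, 2.0f, x, y) with x = {0, 1, 2, 3} and y = {1, 1, 1, 1}
leaves y = {1, 3, 5, 7}, whether or not the region is actually offloaded.
A loop with a reduction clause would fall into the "more complicated
cases" mentioned above.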