On Thu, Oct 22, 2015 at 07:16:49PM +0200, Bernd Schmidt wrote:
> I'm not really familiar with OpenMP and what it allows, so take all my
> comments with a grain of salt.
> 
> On 10/22/2015 06:41 PM, Alexander Monakov wrote:
> >The second approach is to run all threads in the warp all the time,
> >making sure they execute the same code with the same data, and thus
> >build up the same local state.
> 
> But is that equivalent? If each thread takes the address of a variable on
> its own stack, that's not the same as taking an address once and
> broadcasting it.

Does PTX allow function scope .shared variables (rather than just file
scope)?  If yes, then perhaps all the automatic vars that in theory could
be passed to other threads (i.e. addressable vars) could be made .shared
and the non-addressable ones .local.  In target constructs directly
embedded into host code you can tell which variables are shared between
teams (those go into .global, though that is primarily the mapped
variables, which are heap allocated, plus firstprivate vars on target but
not on teams) and which are shared between the threads of a team (those
go into .shared).  The problem is separate functions, where it is unknown
whether they are called from teams context (run by just the first thread
in the first warp) or from parallel context (run by one or more warps, so
the privatized vars need to be .local or ideally warp-local), and where it
is also unclear what to do about the broadcasts for the SIMD stuff.
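
To make that a bit more concrete, here is a rough sketch (plain C with
OpenMP directives; all function and variable names are invented, this is
not from any actual patch) of the two situations:

#include <omp.h>

#pragma omp declare target

static void
scale (int *p)
{
  *p *= 2;
}

/* Separate function: compiled without knowing whether the caller is in
   teams context (run by a single thread) or in parallel context (run by
   one or more warps), so it is unclear whether the addressable local tmp
   should end up .shared or .local / warp-local.  */
static int
twice (int val)
{
  int tmp = val;        /* addressable automatic variable */
  scale (&tmp);
  return tmp;
}

#pragma omp end declare target

void
foo (int n, int *arr, int *res)
{
  /* Device copies of arr and res are heap allocated and n is firstprivate
     on target but not on teams: visible to all teams, i.e. .global/heap.  */
  #pragma omp target map(to: arr[0:n]) map(from: res[0:1]) firstprivate(n)
  #pragma omp teams
  {
    int per_team = 0;           /* shared by the team's threads -> .shared */

    #pragma omp parallel
    {
      int per_thread = 0;       /* private to each thread -> .local, or
                                   ideally warp-local */
      #pragma omp for
      for (int i = 0; i < n; i++)
        per_thread += twice (arr[i]);   /* called in parallel context */

      #pragma omp atomic
      per_team += per_thread;
    }

    if (omp_get_team_num () == 0)
      res[0] = twice (per_team);        /* called in teams context */
  }
}

In foo everything can be classified at compile time; for twice it cannot,
which is the problematic case.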

        Jakub
