On Thu, 22 Oct 2015, Bernd Schmidt wrote:
> On 10/22/2015 08:08 PM, Alexander Monakov wrote:
> > On Thu, 22 Oct 2015, Bernd Schmidt wrote:
> >
> > > I'm not really familiar with OpenMP and what it allows, so take all my
> > > comments with a grain of salt.
> > >
> > > On 10/22/2015 06:41 PM, Alexander Monakov wrote:
> > > > The second approach is to run all threads in the warp all the time,
> > > > making sure they execute the same code with the same data, and thus
> > > > build up the same local state.
> > >
> > > But is that equivalent? If each thread takes the address of a variable
> > > on its own stack, that's not the same as taking an address once and
> > > broadcasting it.
> >
> > Taking the address yields the same pointer in all threads in PTX. Even if
> > it didn't, broadcasting the pointer is pointless, as stacks are
> > thread-private.
>
> It doesn't yield a pointer pointing to the same location in memory, which is
> the point. If you then have code operating on the location being pointed to,
> behaviour would be different than what you'd expect if you were on the host
> and broadcasted the pointer.

The value in that location would be the same, by construction. You'd only
diverge on an attempt to perform an atomic on that location.

Alexander