On 10/22/2015 08:08 PM, Alexander Monakov wrote:
> On Thu, 22 Oct 2015, Bernd Schmidt wrote:
>> I'm not really familiar with OpenMP and what it allows, so take all my
>> comments with a grain of salt.
>>
>> On 10/22/2015 06:41 PM, Alexander Monakov wrote:
>>> The second approach is to run all threads in the warp all the time, making
>>> sure they execute the same code with the same data, and thus build up the
>>> same local state.
>>
>> But is that equivalent? If each thread takes the address of a variable on its
>> own stack, that's not the same as taking an address once and broadcasting it.
>
> Taking the address yields the same pointer in all threads in PTX. Even if it
> didn't, broadcasting the pointer is pointless, as stacks are thread-private.

It doesn't yield a pointer pointing to the same location in memory,
which is the point. If you then have code operating on the location
being pointed to, behaviour would be different from what you'd expect if
you were on the host and broadcast the pointer.
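
To make the difference concrete, here is a minimal host-side sketch in
plain C. It is only a simulation, not PTX or OpenMP code: NLANES,
shared_location and private_locations are hypothetical names made up for
illustration, and the array x[] merely stands in for thread-private
stacks. It assumes every lane performs the same store through the
pointer it holds.

```c
#include <assert.h>

#define NLANES 4  /* hypothetical warp width, kept small for illustration */

/* Host-like model: the address of x is taken once and every lane then
   uses that one pointer, so all updates land in the same object. */
static int shared_location(void)
{
    int x = 0;
    int *p = &x;                     /* address taken once, then "broadcast" */
    for (int lane = 0; lane < NLANES; lane++)
        *p += 1;                     /* each lane updates the same location */
    return x;
}

/* Replicated model: every lane owns a private copy of x and takes the
   address of its own copy, so no lane sees another lane's update. */
static int private_locations(int lane_to_read)
{
    int x[NLANES] = {0};             /* stand-in for NLANES private stacks */
    assert(lane_to_read >= 0 && lane_to_read < NLANES);
    for (int lane = 0; lane < NLANES; lane++) {
        int *p = &x[lane];           /* each lane takes &x of its own copy */
        *p += 1;
    }
    return x[lane_to_read];
}
```

In this sketch shared_location() returns NLANES, while
private_locations(0) returns 1, which is the kind of divergence from
host behaviour I mean.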
The problem is that you may get user programs which have this
behaviour, and which may not be supportable. I think that is what Jakub
was trying to say (correct me if I'm wrong).
Bernd