Re: [og7] Update nvptx_fork/join barrier placement

2018-03-19 Thread Cesar Philippidis
On 03/19/2018 10:02 AM, Tom de Vries wrote:
> On 03/19/2018 03:55 PM, Cesar Philippidis wrote:
>>> Note that this changes ordering of the vector-neutering jump and
>>> worker-neutering jump at the end. In principle, this should not be
>>> harmful, but it violates the invariant that vector-neutering

Re: [og7] Update nvptx_fork/join barrier placement

2018-03-19 Thread Tom de Vries
On 03/19/2018 03:55 PM, Cesar Philippidis wrote:
>> Note that this changes ordering of the vector-neutering jump and
>> worker-neutering jump at the end. In principle, this should not be
>> harmful, but it violates the invariant that vector-neutering
>> branch-around code should be as short-lived as possible

Re: [og7] Update nvptx_fork/join barrier placement

2018-03-19 Thread Tom de Vries
On 03/19/2018 03:55 PM, Cesar Philippidis wrote:
> Is your patch purely for debugging, or are you planning on committing
> it to og7 and trunk?

I plan to commit it. We have no test-cases testing the neutering code order explicitly. So this check is the only thing that allows us to detect regressi

Re: [og7] Update nvptx_fork/join barrier placement

2018-03-19 Thread Cesar Philippidis
On 03/19/2018 07:04 AM, Tom de Vries wrote:
> On 03/09/2018 05:55 PM, Cesar Philippidis wrote:
>> On 03/09/2018 08:21 AM, Tom de Vries wrote:
>>> On 03/09/2018 12:31 AM, Cesar Philippidis wrote:
>>>> Nvidia Volta GPUs now support warp-level synchronization.
>>>
>>> Well, let's try to make that stat

Re: [og7] Update nvptx_fork/join barrier placement

2018-03-19 Thread Tom de Vries
On 03/09/2018 05:55 PM, Cesar Philippidis wrote:
> On 03/09/2018 08:21 AM, Tom de Vries wrote:
>> On 03/09/2018 12:31 AM, Cesar Philippidis wrote:
>>> Nvidia Volta GPUs now support warp-level synchronization.
>>
>> Well, let's try to make that statement a bit more precise.
>>
>> All Nvidia architectures have sup

Re: [og7] Update nvptx_fork/join barrier placement

2018-03-09 Thread Cesar Philippidis
On 03/09/2018 08:21 AM, Tom de Vries wrote:
> On 03/09/2018 12:31 AM, Cesar Philippidis wrote:
>> Nvidia Volta GPUs now support warp-level synchronization.
>
> Well, let's try to make that statement a bit more precise.
>
> All Nvidia architectures have supported synchronization of threads in a

Re: [og7] Update nvptx_fork/join barrier placement

2018-03-09 Thread Tom de Vries
On 03/09/2018 12:31 AM, Cesar Philippidis wrote:
> Nvidia Volta GPUs now support warp-level synchronization.

Well, let's try to make that statement a bit more precise.

All Nvidia architectures have supported synchronization of threads in a warp on a very basic level: by means of convergence (an

[og7] Update nvptx_fork/join barrier placement

2018-03-08 Thread Cesar Philippidis
Nvidia Volta GPUs now support warp-level synchronization. As such, the semantics of legacy bar.sync instructions have slightly changed on newer GPUs. The PTX JIT will now, occasionally, emit a warpsync instruction immediately before a bar.sync for Volta GPUs. That implies that warps must be converg