PTX: improve correctness in SIMD regions

Richard Biener Wed, 18 Jan 2017 04:46:17 -0800

On Tue, Jan 17, 2017 at 9:07 PM, Alexander Monakov <amona...@ispras.ru> wrote:
> Hello,
>
> This patch series addresses a correctness issue in how OpenMP SIMD regions are
> transformed for SIMT execution.  On NVPTX, OpenMP target code runs with
> per-warp stacks outside of SIMD regions, and needs to transition to per-lane
> stacks on SIMD region boundaries.  Originally the plan was to implement that
> by outlining the SIMD loop into a separate function and switching stacks
> around the function call.  I didn't like that approach due to how it would
> penalize even the simplest SIMD loops, and how it's not convenient to
> implement in GCC.
>
> These patches implement an alternative approach I didn't see until recently.
> Instead of outlining, collect variables that would need to be on per-lane
> stacks (that is, addressable private variables) into one struct, and allocate
> that struct with an alloca-like function.
>
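
For readers following the thread, the struct-based approach described above can
be pictured roughly as follows.  This is only a source-level sketch under
stated assumptions, not the transformation the patches perform inside GCC:
`struct simd_privates`, `bump` and `simd_region` are hypothetical names, and
plain `alloca` stands in for the "alloca-like function" mentioned in the
quoted text.

```c
#include <alloca.h>
#include <assert.h>

/* Hypothetical: the addressable private variables of one SIMD region,
   gathered into a single struct instead of separate stack slots. */
struct simd_privates {
    int tmp;      /* private variable whose address is taken */
    int scratch;  /* second address-taken private variable */
};

/* Takes a pointer, so the argument must live in addressable memory. */
static void bump(int *p) { *p += 1; }

static int simd_region(const int *in, int n)
{
    assert(n >= 0);

    /* One allocation for the whole region.  Here alloca is an assumed
       stand-in for whatever per-lane allocator the target provides. */
    struct simd_privates *priv = alloca(sizeof *priv);

    int sum = 0;
    for (int i = 0; i < n; i++) {
        priv->tmp = in[i];
        bump(&priv->tmp);              /* address of a struct member */
        priv->scratch = priv->tmp;
        bump(&priv->scratch);
        sum += priv->scratch;          /* in[i] + 2 */
    }
    return sum;
}
```

The point of the sketch is that every address-taken private ends up behind one
pointer, so only that single allocation has to be placed in per-lane storage.
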
> After OpenMP lowering, inlining might break this by inlining functions with
> address-taken locals into SIMD regions.  For now, such inlining is disallowed
> (this penalizes only SIMT code), but eventually that can be handled by
> collecting those locals into an allocated struct in a similar manner.


Can you do the allocation fully after inlining instead?

Richard.

> Alexander

Re: [PATCH 0/5] OpenMP/PTX: improve correctness in SIMD regions
