Hello,

This patch series addresses a correctness issue in how OpenMP SIMD regions are transformed for SIMT execution.

On NVPTX, OpenMP target code runs with per-warp stacks outside of SIMD regions and needs to transition to per-lane stacks at SIMD region boundaries. The original plan was to implement that by outlining the SIMD loop into a separate function and switching stacks around the function call. I didn't like that approach because it would penalize even the simplest SIMD loops, and because it is not convenient to implement in GCC.

These patches implement an alternative approach that I didn't see until recently. Instead of outlining, collect the variables that would need to be on per-lane stacks (that is, addressable private variables) into one struct, and allocate that struct with an alloca-like function.

After OpenMP lowering, inlining might break this by inlining functions with address-taken locals into SIMD regions. For now, such inlining is disallowed (this penalizes only SIMT code), but eventually it can be handled by collecting those locals into an allocated struct in a similar manner.

Alexander
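
To illustrate the struct-collection idea described above, here is a rough source-level sketch. It is only an analogy: the actual transformation is performed by the compiler during OpenMP lowering, not in user code. The names `simt_alloc`, `simt_free`, `struct simd_privates`, and `bump` are hypothetical, and `malloc` merely stands in for the alloca-like function, which is assumed to hand each lane its own storage.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the alloca-like allocation; on a SIMT
   target each lane is assumed to receive its own block. */
static void *simt_alloc(size_t size) { return malloc(size); }
static void simt_free(void *p) { free(p); }

/* Takes a pointer, so the caller's variable becomes addressable. */
static void bump(int *p) { *p += 1; }

/* Before: 'tmp' is a private, address-taken variable in the SIMD loop,
   so it would have to live on a per-lane stack. */
int sum_before(const int *a, int n)
{
    int sum = 0;
    #pragma omp simd reduction(+:sum)
    for (int i = 0; i < n; i++) {
        int tmp = a[i];
        bump(&tmp);
        sum += tmp;
    }
    return sum;
}

/* After (conceptually): addressable privates are gathered into one
   struct, allocated once on entry to the region. */
struct simd_privates {
    int tmp;
};

int sum_after(const int *a, int n)
{
    int sum = 0;
    /* On a SIMT target, every lane would execute this allocation. */
    struct simd_privates *priv = simt_alloc(sizeof *priv);
    assert(priv != NULL);
    for (int i = 0; i < n; i++) {
        priv->tmp = a[i];
        bump(&priv->tmp);
        sum += priv->tmp;
    }
    simt_free(priv);
    return sum;
}
```

Both functions compute the same result on a host compiler; the second just makes explicit where the addressable private storage comes from. The `omp simd` pragma is left off `sum_after` because in this host-side analogy the struct is a single shared object rather than one per lane.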