Re: [PATCH 0/2] [OpenACC] Kernels loop annotation

Sandra Loosemore Thu, 10 Sep 2020 09:01:21 -0700

On 9/10/20 4:20 AM, Richard Biener wrote:

On Wed, Sep 9, 2020 at 7:55 PM Sandra Loosemore <san...@codesourcery.com> wrote:


This set of patches implements C/C++ and Fortran front end support for
adding "acc loop auto" annotations to loop nests in OpenACC kernels
regions.  For background on this, refer to Thomas Schwinge's talk from
last year's cauldron, at

https://gcc.gnu.org/wiki/cauldron2019talks?action=AttachFile&do=view&target=OpenACC+kernels-cauldron2019.pdf

In particular, pages 20-24 describe this part of the work.  We're
trying to identify loops that might be parallelizable and convert them
to ACC_LOOP tree structures for further analysis, instead of lowering
them to goto form early in compilation, as we do with ordinary
for/while/do loops in C/C++ and DO loops in Fortran.


So the issue I ran into when trying a simplistic "transfer" of DO CONCURRENT
is that variables in DO CONCURRENT scope get moved to function scope
by simplification and nothing prevents optimizers from extending lifetime
of those which means we end up eventually creating additional cross-iteration
dependences and the result is a loop that is no longer satisfying 'DO
CONCURRENT'.

I don't have any background on this issue, but I think it must be
orthogonal?  My patch only examines EXEC_DO, not EXEC_DO_CONCURRENT.

I realize OACC handling is hacked in place in a set of passes during early
optimization so these kind of transforms simply might not happen "yet"
(by luck - nothing made them "invalid" on GIMPLE).

I didn't look at how you "annotate" and until when the annotation prevails
(the headers of the two patches don't say so either) so maybe you will
not have such issues by design?

The strategy is pretty simple: it does a code walk to examine the
parsed form of ordinary loop constructs (EXEC_DO in Fortran, FOR_STMT in
the newly combined C/C++ representation) within a kernels region.  If
any loop in a nest has an explicit "acc loop" annotation, the annotator
ignores that entire nest on the theory that the user has already
indicated what parallelism they want, except for combined "acc kernels
loop" directives where the intent in actual code seems to be to try to
optimize the entire nest.  It does some sanity checks about modification
of the loop variable in the body of the loop, etc.  If it looks
plausible, the annotator changes the representation to the equivalent of
"acc loop auto", and it's up to later passes to figure out whether
"auto" can be compiled as "parallel" or if it has to fall back to "seq".
I tried to add a lot of comments throughout the code explaining the
rationale for the various heuristics and restrictions controlling the
annotation.
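
To make the description above concrete, here is a rough source-level
sketch in C.  It is an illustration only, not code from the patches:
the function names, array sizes, and data clauses are made up, the
annotator works on the parsed representation rather than on source
text, and whether the last loop is actually rejected depends on the
real heuristics.  The first function is what a user might write, the
second is approximately what the annotated form corresponds to, and the
third shows the kind of loop-variable modification the sanity checks
are presumably meant to screen out.

```c
#define N 4
#define M 3

/* What the user wrote: a kernels region containing an ordinary loop
   nest with no "acc loop" directives (hypothetical example).  */
void
scale_plain (float out[N][M], float in[N][M], float k)
{
#pragma acc kernels copyin(in[0:N][0:M]) copyout(out[0:N][0:M])
  {
    for (int i = 0; i < N; i++)
      for (int j = 0; j < M; j++)
        out[i][j] = k * in[i][j];
  }
}

/* Approximate source-level equivalent after annotation: each loop in
   the nest is marked "auto", leaving it to later passes to decide
   between parallel and sequential execution.  */
void
scale_annotated (float out[N][M], float in[N][M], float k)
{
#pragma acc kernels copyin(in[0:N][0:M]) copyout(out[0:N][0:M])
  {
#pragma acc loop auto
    for (int i = 0; i < N; i++)
#pragma acc loop auto
      for (int j = 0; j < M; j++)
        out[i][j] = k * in[i][j];
  }
}

/* A loop that modifies its own loop variable in the body.  The
   assumption here is that a loop like this would be left unannotated
   and compiled as an ordinary sequential loop.  */
int
count_skipping (int n)
{
  int visited = 0;
#pragma acc kernels
  {
    for (int i = 0; i < n; i++)
      {
        visited++;
        if (i % 2 == 0)
          i++;  /* Extra increment of the loop variable.  */
      }
  }
  return visited;
}
```

Compiled without -fopenacc the pragmas are ignored, so the first two
functions compute the same result either way; the annotation is only
meant to change how the loops may be scheduled, not what they compute.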


-Sandra
