On 11/03/15 10:46, Jakub Jelinek wrote:
On Mon, Nov 02, 2015 at 11:18:37AM -0500, Nathan Sidwell wrote:
This is the core execution bits of OpenACC reductions.

We have a new internal fn 'IFN_GOACC_REDUCTION' and a new target hook
goacc.reduction, to lower it on the target compiler.

So, let me start with a few questions:
1) does OpenACC allow UDRs or only the built-in reductions?  If it
    does not allow UDRs, do you have it covered by testcases that you
    disallow parsing of them (e.g. when you have

No UDRs.  Will check the test cases for that.

#pragma omp declare reduction (xyz: struct S: omp_out.x += omp_in.y) initializer (omp_priv = { 5 })
#pragma acc parallel reduction (xyz: var_with_type_S)
    )?

2) how do you expand the reductions in the end when targeting host fallback
    or when targeting non-PTX targets?

That's what default_goacc_reduction is doing.

(I see its comment hasn't caught up with the changes I made during the merge. Will fix.)

   LHS-opt = IFN_RED (KIND, RES_PTR, VAR, LEVEL, OP, OFFSET)
   If RES_PTR is not integer-zerop:
       SETUP - emit 'LHS = *RES_PTR', LHS = NULL
       TEARDOWN - emit '*RES_PTR = VAR'
   If LHS is not NULL
       emit 'LHS = VAR'

This is the correct behaviour for a single-threaded loop. Of course, the loop could go on to be parallelized in the normal way, or undergo additional conversion to OpenMP constructs along the same lines as we discussed for the GOACC_LOOP function.
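To make the comment above concrete, here is a rough C model of what the emitted statements would compute on a single thread. It is only a sketch, not the code in the patch: the enum, model_default_reduction and host_fallback_sum are hypothetical names made up for illustration, a NULL res_ptr stands in for "RES_PTR is integer-zerop", and a NULL lhs stands in for a call with no LHS.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the KIND argument of IFN_RED. */
enum red_kind { RED_SETUP, RED_INIT, RED_FINI, RED_TEARDOWN };

/* Model of the statements emitted for one IFN_RED call.
   This runs the assignments directly instead of emitting them. */
static void
model_default_reduction (enum red_kind kind, int *res_ptr, int var, int *lhs)
{
  if (res_ptr != NULL)
    {
      if (kind == RED_SETUP)
        {
          /* SETUP: 'LHS = *RES_PTR', then LHS = NULL.  */
          if (lhs != NULL)
            *lhs = *res_ptr;
          lhs = NULL;
        }
      else if (kind == RED_TEARDOWN)
        /* TEARDOWN: '*RES_PTR = VAR'.  */
        *res_ptr = var;
    }

  /* If LHS is not NULL: 'LHS = VAR'.  */
  if (lhs != NULL)
    *lhs = var;
}

/* A '+' reduction run as a plain single-threaded loop, with the four
   calls placed around it (assumed ordering, for illustration only).  */
int
host_fallback_sum (const int *data, size_t n, int *res_ptr)
{
  int sum = 0;

  model_default_reduction (RED_SETUP, res_ptr, sum, &sum);
  model_default_reduction (RED_INIT, res_ptr, sum, &sum);
  for (size_t i = 0; i < n; i++)
    sum += data[i];
  model_default_reduction (RED_FINI, res_ptr, sum, &sum);
  model_default_reduction (RED_TEARDOWN, res_ptr, sum, &sum);

  return sum;
}
```

In this model INIT and FINI are plain copies, so with a receiver object the loop simply accumulates on top of the incoming value and writes the total back at TEARDOWN; without one, every call reduces to 'LHS = VAR'.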

Does that help?

nathan
