On 11/03/15 10:46, Jakub Jelinek wrote:
On Mon, Nov 02, 2015 at 11:18:37AM -0500, Nathan Sidwell wrote:
This is the core execution bits of OpenACC reductions.
We have a new internal fn 'IFN_GOACC_REDUCTION' and a new target hook
goacc.reduction, to lower it on the target compiler.
So, let me start with a few questions:
1) does OpenACC allow UDRs or only the built-in reductions? If it
does not allow UDRs, do you have it covered by testcases that you
disallow parsing of them (e.g. when you have
#pragma omp declare reduction (xyz: struct S: omp_out.x += omp_in.y)
initializer (omp_priv = { 5 })
#pragma acc parallel reduction (xyz: var_with_type_S)
)?

No UDRs, only the built-in reductions. Will check the test cases for that.

2) how do you expand the reductions in the end when targeting host fallback
or when targeting non-PTX targets?
That's what default_goacc_reduction is doing.
(I see its comment hasn't caught up with the changes I made during the merge.
Will fix)
  LHS-opt = IFN_RED (KIND, RES_PTR, VAR, LEVEL, OP, OFFSET)

  If RES_PTR is not integer-zerop:
      SETUP    - emit 'LHS = *RES_PTR', LHS = NULL
      TEARDOWN - emit '*RES_PTR = VAR'
  If LHS is not NULL:
      emit 'LHS = VAR'
This is the correct behaviour for a single-threaded loop. Of course the loop
could go on to be parallelized in the normal way, or be converted further to
OpenMP constructs along the same lines as we discussed for the GOACC_LOOP function.
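
To make the single-threaded behaviour concrete, here is a rough run-time model in C. It is only a sketch, not the GCC code: the real hook emits statements at compile time, whereas this executes them directly. The names model_kind, model_reduction and sum_with_model are made up for illustration, and LEVEL, OP and OFFSET are left out because the lowering described above does not use them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the KIND argument.  Only SETUP and TEARDOWN
   are treated specially; every other kind is a plain copy.  */
enum model_kind { MODEL_SETUP, MODEL_TEARDOWN, MODEL_OTHER };

/* Model of one IFN_RED call: returns the value LHS would receive.
   A null RES_PTR plays the role of 'integer-zerop'.  */
static int
model_reduction (enum model_kind kind, int *res_ptr, int var)
{
  if (res_ptr != NULL)
    {
      if (kind == MODEL_SETUP)
        return *res_ptr;        /* LHS = *RES_PTR, nothing more to do.  */
      if (kind == MODEL_TEARDOWN)
        *res_ptr = var;         /* *RES_PTR = VAR, then fall through.  */
    }
  return var;                   /* LHS = VAR.  */
}

/* A '+' reduction over 1..n run on a single thread.  The calls before and
   after the loop mirror the places where a reduction call would sit.  */
static int
sum_with_model (int *result, int n)
{
  assert (n >= 0);
  int v = 0;

  v = model_reduction (MODEL_SETUP, result, v);
  v = model_reduction (MODEL_OTHER, result, v);
  for (int i = 1; i <= n; i++)
    v += i;
  v = model_reduction (MODEL_OTHER, result, v);
  v = model_reduction (MODEL_TEARDOWN, result, v);
  return v;
}
```

With a receiver object holding 10, sum_with_model (&total, 4) loads 10 at SETUP, adds 1+2+3+4, and stores 20 back at TEARDOWN. With a null RES_PTR every call is just a copy of VAR.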
Does that help?
nathan