On Wed, Oct 21, 2015 at 03:49:08PM -0400, Nathan Sidwell wrote:
> This patch is the device-specific half of the previous patch.  It processes
> the partition head & tail markers and loop abstraction functions inserted
> during omp lowering.
> 
> In the oacc_device_lower pass we scan the CFG reconstructing the set of
> nested loops demarked by IFN_UNIQUE (HEAD_MARK) & IFN_UNIQUE (TAIL_MARK)
> functions. The HEAD_MARK function provides the loop partition information
> provided by the user.  Once constructed we can iterate over that structure
> checking partitioning consistency (for instance an inner loop must use a
> dimension 'inside' an outer loop). We also assign specific partitioning axes
> here.  Partitioning updates the parameters of the IFN_LOOP and IFN_FORK/JOIN
> functions appropriately.
> 
> Once partitioning has been determined, we iterate over the CFG scanning for
> the marker, fork/join and loop functions.  The marker functions are deleted,
> the fork & join functions are conditionally deleted (using the target hook
> of patch 3), and the loop function is expanded into code calculating the
> loop parameters depending on how the loop has been partitioned.  This uses
> the OACC_DIM_POS and OACC_DIM_SIZE builtins included in patch 7.

So, how do you expand the OACC loops on non-PTX devices (host, or say
Xeon Phi)?  Do you drop the IFNs and replace them with normal loops?
I don't see anything that would set the various flags that e.g. OpenMP
#pragma omp simd or Cilk+ #pragma simd sets, like loop->safelen,
loop->force_vectorize, and maybe loop->simduid, or that would promote some
vars to simduid arrays, if that is relevant to OpenACC.
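
To make the question concrete, here is a minimal sketch in plain C of what
I understand the expansion to compute.  This is not the code the patch
generates: struct dim, partitioned_sum and all_positions_sum are
hypothetical names, with pos and size standing in for the values the
OACC_DIM_POS and OACC_DIM_SIZE builtins would return, and the strided
distribution is an assumption of mine.

```c
#include <assert.h>

/* Hypothetical stand-in for one partitioning axis: 'pos' is this
   executor's index along the axis, 'size' is the extent of the axis.  */
struct dim { int pos; int size; };

/* Sum the elements of a[0..n) owned by one position on the axis,
   assuming a strided (cyclic) distribution of iterations.  */
long partitioned_sum (const int *a, int n, struct dim d)
{
  assert (d.size > 0 && d.pos >= 0 && d.pos < d.size);
  long sum = 0;
  for (int i = d.pos; i < n; i += d.size)
    sum += a[i];
  return sum;
}

/* Run every position of the axis in turn and combine the results;
   this models all executors covering the iteration space once.  */
long all_positions_sum (const int *a, int n, int size)
{
  long total = 0;
  for (int pos = 0; pos < size; pos++)
    total += partitioned_sum (a, n, (struct dim){ pos, size });
  return total;
}
```

With size == 1 and pos == 0, which is what I would expect on the host, the
loop in partitioned_sum degenerates into an ordinary loop over 0..n-1.
That is the case I am asking about: nothing in such a loop tells the
vectorizer that its iterations are independent.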

        Jakub
