On Thu, Nov 12, 2015 at 3:31 PM, Tom de Vries <tom_devr...@mentor.com> wrote:
> On 11/11/15 12:03, Richard Biener wrote:
>>
>> On Mon, 9 Nov 2015, Tom de Vries wrote:
>>
>>> On 09/11/15 16:35, Tom de Vries wrote:
>>>>
>>>> Hi,
>>>>
>>>> this patch series for stage1 trunk adds support to:
>>>> - parallelize oacc kernels regions using parloops, and
>>>> - map the loops onto the oacc gang dimension.
>>>>
>>>> The patch series contains these patches:
>>>>
>>>>  1 Insert new exit block only when needed in
>>>>    transform_to_exit_first_loop_alt
>>>>  2 Make create_parallel_loop return void
>>>>  3 Ignore reduction clause on kernels directive
>>>>  4 Implement -foffload-alias
>>>>  5 Add in_oacc_kernels_region in struct loop
>>>>  6 Add pass_oacc_kernels
>>>>  7 Add pass_dominator_oacc_kernels
>>>>  8 Add pass_ch_oacc_kernels
>>>>  9 Add pass_parallelize_loops_oacc_kernels
>>>> 10 Add pass_oacc_kernels pass group in passes.def
>>>> 11 Update testcases after adding kernels pass group
>>>> 12 Handle acc loop directive
>>>> 13 Add c-c++-common/goacc/kernels-*.c
>>>> 14 Add gfortran.dg/goacc/kernels-*.f95
>>>> 15 Add libgomp.oacc-c-c++-common/kernels-*.c
>>>> 16 Add libgomp.oacc-fortran/kernels-*.f95
>>>>
>>>> The first 9 patches are more or less independent, but patches 10-16 are
>>>> intended to be committed at the same time.
>>>>
>>>> Bootstrapped and reg-tested on x86_64.
>>>>
>>>> Build and reg-tested with nvidia accelerator, in combination with a
>>>> patch that enables accelerator testing (which is submitted at
>>>> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).
>>>>
>>>> I'll post the individual patches in reply to this message.
>>>
>>>
>>> This patch updates existing testcases with new pass numbers, given the
>>> passes that were added in the pass list in patch 10.
>>
>>
>> I think it would be nice to be able to specify the number in the .def
>> file instead so we can avoid this kind of churn everytime we do this.
>
>
> How about something along the lines of:
> ...
>   /* pass_build_ealias is a dummy pass that ensures that we
>      execute TODO_rebuild_alias at this point.  */
>   NEXT_PASS (pass_build_ealias);
>   /* Pass group that runs when there are oacc kernels in the
>      function.  */
>   NEXT_PASS (pass_oacc_kernels);
>   PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
>     PUSH_ID ("oacc_kernels")
>     ...
>     POP_ID ()
>   POP_INSERT_PASSES ()
>   NEXT_PASS (pass_fre);
> ...
>
> where the PUSH_ID/POP_ID pair has the functionality that all the contained
> passes:
> - have the id prefixed to the dump file, so the dump file of pass_ch
>   which normally is "ch" becomes "oacc_kernels_ch", and
> - the pass name in pass_instances.def becomes pass_oacc_kernels_ch, such
>   that it doesn't count as numbered instance of pass_ch
> ?

Hmm.  I'd like to have sth that allows me to add "slp" to both
pass_slp_vectorize instances, having them share the suffix (as no two
functions are in both dumps).  We similarly have "duplicates" across
the -Og vs. the -O[0-3] pipeline.

Basically make all dump file name suffixes manually specified
which means moving them from the class definition to the actual
instance.

Well, just an idea.  In a distant future I like our pass pipeline to become
more dynamic, getting away from a static passes.def towards, say, a
pass "script" (to be able to say "if inlining did nothing skip this group"
or similar).

Richard.

> Thanks,
> - Tom
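
To make the numbering effect of the PUSH_ID/POP_ID proposal quoted above
concrete, here is a minimal C++ sketch.  It is only a toy model, not GCC
code: PassList, push_id, pop_id, next_pass and build_example are all
hypothetical names, and it assumes that instance numbers are counted per
effective pass name.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of a pass list; none of these names are GCC APIs.
struct PassInstance {
  std::string name;  // effective dump name, e.g. "oacc_kernels_ch"
  int instance;      // 1-based count among passes sharing this name
};

class PassList {
public:
  // Models PUSH_ID ("..."): passes added until the matching pop_id
  // get this id prefixed to their name.
  void push_id(const std::string &id) { prefixes_.push_back(id); }

  // Models POP_ID ().
  void pop_id() {
    assert(!prefixes_.empty() && "pop_id without matching push_id");
    prefixes_.pop_back();
  }

  // Models NEXT_PASS (pass_<base>): records the pass under its
  // effective (possibly prefixed) name and numbers it per name.
  PassInstance next_pass(const std::string &base) {
    std::string name;
    for (const std::string &p : prefixes_)
      name += p + "_";
    name += base;
    int instance = ++counts_[name];
    passes_.push_back({name, instance});
    return passes_.back();
  }

  const std::vector<PassInstance> &passes() const { return passes_; }

private:
  std::vector<std::string> prefixes_;  // active PUSH_ID prefixes
  std::map<std::string, int> counts_;  // instances seen per effective name
  std::vector<PassInstance> passes_;   // passes in pipeline order
};

// Toy pipeline: a "ch" pass, a prefixed "ch" inside a group, then
// another plain "ch" after the group.
PassList build_example() {
  PassList pl;
  pl.next_pass("ch");
  pl.push_id("oacc_kernels");
  pl.next_pass("ch");
  pl.pop_id();
  pl.next_pass("ch");
  return pl;
}
```

In this model the pass inside the group is recorded as instance 1 of
"oacc_kernels_ch", so the plain "ch" after the group remains instance 2
instead of being bumped to 3, which is the kind of testcase churn the
proposal is meant to avoid.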