On Thu, 2015-11-12 at 15:43 +0100, Richard Biener wrote:
> On Thu, Nov 12, 2015 at 3:31 PM, Tom de Vries <tom_devr...@mentor.com> wrote:
> > On 11/11/15 12:03, Richard Biener wrote:
> >>
> >> On Mon, 9 Nov 2015, Tom de Vries wrote:
> >>
> >>> On 09/11/15 16:35, Tom de Vries wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> this patch series for stage1 trunk adds support to:
> >>>> - parallelize oacc kernels regions using parloops, and
> >>>> - map the loops onto the oacc gang dimension.
> >>>>
> >>>> The patch series contains these patches:
> >>>>
> >>>>  1 Insert new exit block only when needed in
> >>>>    transform_to_exit_first_loop_alt
> >>>>  2 Make create_parallel_loop return void
> >>>>  3 Ignore reduction clause on kernels directive
> >>>>  4 Implement -foffload-alias
> >>>>  5 Add in_oacc_kernels_region in struct loop
> >>>>  6 Add pass_oacc_kernels
> >>>>  7 Add pass_dominator_oacc_kernels
> >>>>  8 Add pass_ch_oacc_kernels
> >>>>  9 Add pass_parallelize_loops_oacc_kernels
> >>>> 10 Add pass_oacc_kernels pass group in passes.def
> >>>> 11 Update testcases after adding kernels pass group
> >>>> 12 Handle acc loop directive
> >>>> 13 Add c-c++-common/goacc/kernels-*.c
> >>>> 14 Add gfortran.dg/goacc/kernels-*.f95
> >>>> 15 Add libgomp.oacc-c-c++-common/kernels-*.c
> >>>> 16 Add libgomp.oacc-fortran/kernels-*.f95
> >>>>
> >>>> The first 9 patches are more or less independent, but patches 10-16 are
> >>>> intended to be committed at the same time.
> >>>>
> >>>> Bootstrapped and reg-tested on x86_64.
> >>>>
> >>>> Build and reg-tested with nvidia accelerator, in combination with a
> >>>> patch that enables accelerator testing (which is submitted at
> >>>> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).
> >>>>
> >>>> I'll post the individual patches in reply to this message.
> >>>
> >>>
> >>> This patch updates existing testcases with new pass numbers, given the
> >>> passes that were added in the pass list in patch 10.
> >>
> >>
> >> I think it would be nice to be able to specify the number in the .def
> >> file instead so we can avoid this kind of churn everytime we do this.
> >
> >
> > How about something along the lines of:
> > ...
> >   /* pass_build_ealias is a dummy pass that ensures that we
> >      execute TODO_rebuild_alias at this point.  */
> >   NEXT_PASS (pass_build_ealias);
> >   /* Pass group that runs when there are oacc kernels in the
> >      function.  */
> >   NEXT_PASS (pass_oacc_kernels);
> >   PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
> >   PUSH_ID ("oacc_kernels")
> >   ...
> >   POP_ID ()
> >   POP_INSERT_PASSES ()
> >   NEXT_PASS (pass_fre);
> > ...
> >
> > where the PUSH_ID/POP_ID pair has the functionality that all the contained
> > passes:
> > - have the id prefixed to the dump file, so the dump file of pass_ch
> >   which normally is "ch" becomes "oacc_kernels_ch", and
> > - the pass name in pass_instances.def becomes pass_oacc_kernels_ch, such
> >   that it doesn't count as numbered instance of pass_ch
> > ?
>
> Hmm.  I'd like to have sth that allows me to add "slp" to both
> pass_slp_vectorize instances, having them share the suffix (as no two
> functions are in both dumps).
>
> We similarly have "duplicates" across the -Og vs. the -O[0-3] pipeline.
>
> Basically make all dump file name suffixes manually specified which means
> moving them from the class definition to the actual instance.
>
> Well, just an idea.  In a distant future I like our pass pipeline to become
> more dynamic, getting away from a static passes.def towards, say, a pass
> "script" (to be able to say "if inlining did nothing skip this group" or
> similar).

Can't that be done by having a parent pass to hold them, with a gate
function?  Or are you thinking of having another domain-specific language?

Thinking aloud, I've sometimes wondered if it would be helpful to be able to
subclass pass_manager, so that multiple passes.def files could generate
alternative pass_manager subclasses, with the precise choice of pass_manager
subclass being determined by options+target.  I don't know if that latter
idea is useful though.

Dave
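
To make the parent-pass suggestion a bit more concrete, here is a toy,
standalone C++ sketch of "a parent pass that holds a group, with a gate
function".  It is only an illustration under stated assumptions: ToyPass,
run_passes, run_pipeline and the pass names "post_inline" and
"fre_after_inline" are all hypothetical and are not GCC's opt_pass or
pass_manager API.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy model only: hypothetical types, NOT GCC's opt_pass/pass_manager.
struct ToyPass {
  std::string name;               // dump-file style name, e.g. "ch"
  std::function<bool()> gate;     // empty means "always run"
  std::vector<ToyPass> children;  // sub-passes held by a parent pass
};

// Walk the passes in order and record the name of every pass that runs.
// If a pass's gate returns false, the pass and all its children are skipped.
void run_passes(const std::vector<ToyPass>& passes,
                std::vector<std::string>& executed) {
  for (const ToyPass& p : passes) {
    assert(!p.name.empty());  // every pass needs a name
    if (p.gate && !p.gate())
      continue;               // gate closed: skip the whole group
    executed.push_back(p.name);
    run_passes(p.children, executed);
  }
}

// Build a small pipeline in which the group under "post_inline" only
// runs when inlining actually changed something.
std::vector<std::string> run_pipeline(bool inlining_changed_something) {
  std::vector<ToyPass> pipeline = {
      ToyPass{"einline", nullptr, {}},
      ToyPass{"post_inline",
              [=] { return inlining_changed_something; },
              {ToyPass{"ch", nullptr, {}},
               ToyPass{"fre_after_inline", nullptr, {}}}},
      ToyPass{"fre", nullptr, {}},
  };
  std::vector<std::string> executed;
  run_passes(pipeline, executed);
  return executed;
}
```

With this model, run_pipeline(false) runs only "einline" and "fre", which is
the "if inlining did nothing skip this group" behaviour without needing a
separate pass script.  How the gate would learn that inlining did nothing is
left open here.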