Re: collaborative tuning of GCC optimization heuristic

2016-03-07 Thread Grigori Fursin

Hi David,

Thanks a lot for a good question - I completely forgot to discuss that.

Current workloads in the CK are just to test our collaborative 
optimization prototype.
They are even a bit outdated (benchmarks and codelets from the MILEPOST 
project).


However, our point is to make an open system where the community can
add any workload via GitHub with some meta information in JSON format
to be able to participate in collaborative tuning. This meta information
exposes the data sets used, command lines, input/output files, etc. This
helps add multiple data sets for a given benchmark or even reuse already
shared ones. Finally, this meta information makes it relatively
straightforward to apply predictive analytics to find correlations
between workloads and optimizations.
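As an illustration only (the field names below are invented for this sketch and are not the actual CK schema), such a JSON meta description might look like:

```json
{
  "benchmark": "susan-corners",
  "command_line": "$BIN $DATASET tmp-output.tmp -c",
  "datasets": ["image-pgm-0001", "image-pgm-0002"],
  "output_files": ["tmp-output.tmp"],
  "build": {"compiler": "gcc", "language": "c"}
}
```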


Our hope is to eventually make a large and diverse pool of public
workloads. In that case, users themselves can derive representative
workloads for their requirements (performance, code size, energy,
resource constraints, etc.) and target hardware. Furthermore, since
optimization spaces are huge and it is infeasible to explore them by one
user or even in one data center, our approach allows all shared
workloads to continuously participate in crowdtuning, i.e. searching for
good optimizations across diverse platforms while recording "unexpected
behavior".

Actually, adding more workloads to CK (while making this process more
user-friendly) and tuning them can be a GSoC project - we can help with
that ...

You can find more about our view here:
* http://arxiv.org/abs/1506.06256
* https://hal.inria.fr/hal-01054763

Hope it makes sense and take care,
Grigori

On 05/03/2016 16:16, David Edelsohn wrote:

On Sat, Mar 5, 2016 at 9:13 AM, Grigori Fursin  wrote:

Dear colleagues,

If it's of interest, we have released a new version of our open-source
framework to share compiler optimization knowledge across diverse workloads
and hardware. We would like to thank all the volunteers who ran this
framework and shared some results for GCC 4.9 .. 6.0 in the public
repository here: http://cTuning.org/crowdtuning-results-gcc

Here is a brief note on how this framework for crowdtuning compiler
optimization heuristics works (for more details, please see
https://github.com/ctuning/ck/wiki/Crowdsource_Experiments): you just
install a small Android app
(https://play.google.com/store/apps/details?id=openscience.crowdsource.experiments)
or python-based Collective Knowledge framework
(http://github.com/ctuning/ck). This program sends system properties to a
public server. The server compiles a random shared workload using some flag
combinations that have been found to work well on similar machines, as well
as some new random ones. The client executes the compiled workload several
times to account for variability etc, and sends the results back to the
server.
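The client-side measurement step can be sketched roughly as follows. This is a simplified illustration only, not the actual CK client or protocol; `time_workload` and its parameters are invented for this sketch:

```python
import statistics
import subprocess
import time

def time_workload(cmd, repetitions=5):
    """Run an already-compiled workload several times and return the
    median wall-clock time, to smooth out run-to-run variability
    before reporting the result back to the server."""
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)
```

Using the median rather than the mean makes a single outlier run (e.g. due to a cache-cold first execution) much less likely to distort the reported result.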

If a combination of compiler flags is found that improves performance over
the combinations found so far, it gets reduced (by removing flags that do
not affect the performance) and uploaded to a public repository.
Importantly, if a combination significantly degrades performance for a
particular workload, it gets recorded as well. This potentially points to a
problem with optimization heuristics for a particular target, which may be
worth investigating and improving.
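The reduction step can be illustrated with a simple greedy pass over the flags. This is a sketch only; the real framework's reduction strategy and tolerances may differ, and `measure` here stands in for an actual benchmark run:

```python
def reduce_flags(flags, measure, tolerance=0.02):
    """Greedily drop flags that do not affect the measured score.
    `measure` maps a flag list to a performance number (higher is
    better); a flag is removed if performance stays within `tolerance`
    of the best score seen with the full combination."""
    best = measure(flags)
    reduced = list(flags)
    for flag in list(flags):
        trial = [f for f in reduced if f != flag]
        if measure(trial) >= best * (1 - tolerance):
            reduced = trial
    return reduced
```

The result is a minimal-ish flag combination that is cheaper to share and easier to interpret when looking for correlations with workload features.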

At the moment, only global GCC compiler flags are exposed for collaborative
optimization. Longer term, it can be useful to cover finer-grain
transformation decisions (vectorization, unrolling, etc) via plugin
interface. Please note that this is a prototype framework and much more can
be done! Please get in touch if you are interested in knowing more or
contributing!

Thanks for creating and sharing this interesting framework.

I think a central issue is the "random shared workload" because the
optimal optimizations and optimization pipeline are
application-dependent.  The proposed changes to the heuristics may
benefit the particular set of workloads that the framework tests,
but why are those workloads and particular implementations of the
workloads representative of applications of interest to end users of
GCC?   GCC is tuned for an arbitrary set of workloads, but why are
the workloads from cTuning any better?

Thanks, David





Re: Validity of SUBREG+AND-imm transformations

2016-03-07 Thread Kyrill Tkachov


On 05/03/16 05:52, Jeff Law wrote:

On 03/04/2016 09:33 AM, Kyrill Tkachov wrote:


On 04/03/16 16:21, Jeff Law wrote:

On 03/04/2016 08:05 AM, Richard Biener wrote:

does that mean that the shift amount should be DImode?
Seems like a more flexible approach would be for the midend to be able
to handle these things...


Or macroize for all integer modes?

That's probably worth exploring.  I wouldn't be at all surprised if
that turns out to be better than any individual mode, not just for
arm & aarch64, but it would help a variety of targets.



What do you mean by 'macroize' here? Do you mean use iterators to create
multple variants of patterns with different
modes on the shift amount?
I believe we'd still run into the issue at
https://gcc.gnu.org/ml/gcc/2016-03/msg00036.html.

We might, but I would expect the number of incidences to be fewer.

Essentially we're giving the compiler multiple options when it comes to representation of the shift amount -- allowing the compiler (combine in particular) to use the shift amount in whatever mode is most natural. I.e., if the count is
sitting in a QI, HI, SI or possibly even a DI register, then it can be used as-is.  No subregs, no zero/sign extensions, no and-imm masking.
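One way to read "macroize" here, purely as an illustrative sketch (the iterator name, pattern name, and output template below are invented and not taken from the actual aarch64 port), is a mode iterator over the shift-amount operand so a single pattern description matches every count mode:

```lisp
;; Sketch only: accept the shift count in any integer mode, so combine
;; can match whichever mode the count already lives in.
(define_mode_iterator SHIFT_AMT [QI HI SI DI])

(define_insn "*ashl<mode>3_<SHIFT_AMT:mode>_count"
  [(set (match_operand:GPI 0 "register_operand" "=r")
        (ashift:GPI
          (match_operand:GPI 1 "register_operand" "r")
          (match_operand:SHIFT_AMT 2 "register_operand" "r")))]
  ""
  "lsl\\t%<w>0, %<w>1, %<w>2")
```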




The RTL documentation for ASHIFT and friends says that the shift amount must be:
"a fixed-point mode or be a constant with mode @code{VOIDmode}; which
mode is determined by the mode called for in the machine description
entry for the left-shift instruction.  For example, on the VAX, the mode
of @var{c} is @code{QImode} regardless of @var{m}."

From what I understand, the "ashl" standard name should expand to an ASHIFT
with a particular mode for the shift amount.
Currently that's QImode for aarch64.
So whenever combine tries to propagate anything into the shift amount it has
to force it into QImode.
I don't see how specifying multiple matching patterns for different modes
will help, as combine propagates the and-immediate operation into the shift
amount, creates the awkward subreg and tries to match that.
It won't try different modes on the shift amount to help matching (and from
previous discussions I understand that's not the direction we want combine
to take).

I've filed PR 70119 with example code to make this easier to track
(the sourceware archives cut off the thread across months :( ).

Thanks for the ideas,
Kyrill


jeff




Re: [gimplefe] [gsoc16] Gimple Front End Project

2016-03-07 Thread Richard Biener
On Mon, Mar 7, 2016 at 7:27 AM, Prasad Ghangal  wrote:
> On 6 March 2016 at 21:13, Richard Biener  wrote:
>>
>> I'll be willing to mentor this.  Though I'd rather have us starting from 
>> scratch and look at having a C-like input language, even piggy-backing on 
>> the C frontend maybe.
>
> That's great. I would like to know scope of the project for gsoc so
> that I can start preparing for proposal.

In my view (this may require discussion) the GIMPLE FE provides a way
to do better unit-testing in GCC, by feeding a GIMPLE pass with
specific IL to work with rather than trying to get that into proper
shape via a C testcase.  Making the input IL to that pass stable over
the development of GCC is especially hard.

A C-like syntax is preferred; a syntax that is also valid C would be
even more preferred, so that you can write "torture" testcases that
have fixed IL into a specific pass but also run as regular testcases
through the whole optimization pipeline.
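A hypothetical example of the "also valid C" property (the function and statements are invented for illustration): restricting the body to simple three-address statements means the same file could be fed as fixed IL into a specific pass or compiled as ordinary C.

```c
/* Every statement is a single GIMPLE-like assignment over named
   temporaries, yet the whole function is plain C.  */
int f (int a, int b)
{
  int t1;
  int t2;
  t1 = a * 2;
  t2 = t1 + b;
  return t2;
}
```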

Piggy-backing on the C frontend makes it possible to leave all the
details of types and declarations
and global initializers as plain C while interpreting function bodies
as "GIMPLE" when leaving the frontend.

I expect that in the process of completing GIMPLE IL features you'll
have to add a few GNU C extensions,
mostly for features used by Ada (self-referential types come to my mind).

I expect the first thing the project needs to do is add the "tooling"
side, signalling the C frontend it
should accept GIMPLE (add a -fgimple flag) plus adding a way to input
IL into a specific pass
(-ftest= or a function attribute so it affects only a specific
function so you can write a testcase
driver in plain C and have the actual testcase in a single function).
The first actual frontend
implementation challenge will then be emitting GIMPLE / CFG / SSA
directly which I'd do in the
"genericization" phase.  Adjustments to how the C FE handles
expressions should be made as well,
for example I'd remove any promotions done, letting it only literally
parse expressions.  Maybe
statement and expression parsing should be forked directly to not make
the C FE's code too unwieldy,
but as said I'd keep type and decl parsing and its data structures as is.

Eventually the dump file format used by GCC's GIMPLE dumps should be
changed to be valid
GIMPLE FE inputs (and thus valid C inputs).  Adjustments mainly need
to be done to basic-block
labels and PHI nodes.
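To illustrate what such an adjusted dump might look like (a sketch only; the `__PHI` notation is invented here, and a real syntax would still need to be designed), the basic-block labels can already be expressed as plain C labels, with the PHI kept as a comment so the file stays compilable:

```c
/* Explicit basic-block labels map onto C labels and gotos; only the
   PHI node needs new syntax, shown here as a comment.  */
int f (int c)
{
  int x;

  if (c > 0) goto bb3; else goto bb4;
bb3:
  x = 1;
  goto bb5;
bb4:
  x = 2;
  goto bb5;
bb5:
  /* x_3 = __PHI (bb3: 1, bb4: 2);  -- value of x merged here  */
  return x;
}
```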

I'd not think about our on-the-side data too much initially
(range info, points-to info, etc).

Richard.

>
>>
>> Richard.
>>
>
> --
> Thanks and Regards,
> Prasad Ghangal


Bootstrapping is currently broken

2016-03-07 Thread Dominik Vogt
A recent patch has broken bootstrapping (s390x) in stage3.  The
failure crept into trunk between Friday and today:

-- snip --
g++ -std=gnu++98   -g -O2 -DIN_GCC -fno-exceptions -fno-rtti 
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
-Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic 
-Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror   
-DHAVE_CONFIG_H -DGENERATOR_FILE -fno-PIE  -no-pie -o build/gencondmd \
build/gencondmd.o .././libiberty/libiberty.a
g++: error: unrecognized command line option ‘-no-pie’
-- snip --

(The compiler in PATH is "gcc (GCC) 4.8.5 20150623 (Red Hat
4.8.5-1)").

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: Bootstrapping is currently broken

2016-03-07 Thread Richard Biener
On Mon, Mar 7, 2016 at 2:12 PM, Dominik Vogt  wrote:
> A recent patch has broken bootstrapping (s390x) in stage3.  The
> failure creeped into trunk between friday and today:
>
> -- snip --
> g++ -std=gnu++98   -g -O2 -DIN_GCC -fno-exceptions -fno-rtti 
> -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
> -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic 
> -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror   
> -DHAVE_CONFIG_H -DGENERATOR_FILE -fno-PIE  -no-pie -o build/gencondmd \
> build/gencondmd.o .././libiberty/libiberty.a
> g++: error: unrecognized command line option ‘-no-pie’
> -- snip --
>
> (The compiler in PATH is "gcc (GCC) 4.8.5 20150623 (Red Hat
> 4.8.5-1)").

Bootstrap should use the built compiler from stage2 in stage3, not
sure how you get the system compiler used there.

Richard.

> Ciao
>
> Dominik ^_^  ^_^
>
> --
>
> Dominik Vogt
> IBM Germany
>


Re: Bootstrapping is currently broken

2016-03-07 Thread Dominik Vogt
On Mon, Mar 07, 2016 at 03:00:03PM +0100, Richard Biener wrote:
> On Mon, Mar 7, 2016 at 2:12 PM, Dominik Vogt  wrote:
> > A recent patch has broken bootstrapping (s390x) in stage3.  The
> > failure creeped into trunk between friday and today:
> >
> > -- snip --
> > g++ -std=gnu++98   -g -O2 -DIN_GCC -fno-exceptions -fno-rtti 
> > -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
> > -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic 
> > -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror   
> > -DHAVE_CONFIG_H -DGENERATOR_FILE -fno-PIE  -no-pie -o build/gencondmd \
> > build/gencondmd.o .././libiberty/libiberty.a
> > g++: error: unrecognized command line option ‘-no-pie’
> > -- snip --
> >
> > (The compiler in PATH is "gcc (GCC) 4.8.5 20150623 (Red Hat
> > 4.8.5-1)").
> 
> Bootstrap should use the built compiler from stage2 in stage3, not
> sure how you get the system compiler used there.

I guess some configure script failed to notice that g++ is not
being built, and a recent change introduced an option that the
installed compiler doesn't have?  Probably configure should throw
an error if bootstrapping is enabled but the c++ language is not
enabled?

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: Bootstrapping is currently broken

2016-03-07 Thread Richard Biener
On Mon, Mar 7, 2016 at 3:12 PM, Dominik Vogt  wrote:
> On Mon, Mar 07, 2016 at 03:00:03PM +0100, Richard Biener wrote:
>> On Mon, Mar 7, 2016 at 2:12 PM, Dominik Vogt  wrote:
>> > A recent patch has broken bootstrapping (s390x) in stage3.  The
>> > failure creeped into trunk between friday and today:
>> >
>> > -- snip --
>> > g++ -std=gnu++98   -g -O2 -DIN_GCC -fno-exceptions -fno-rtti 
>> > -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
>> > -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic 
>> > -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror   
>> > -DHAVE_CONFIG_H -DGENERATOR_FILE -fno-PIE  -no-pie -o build/gencondmd \
>> > build/gencondmd.o .././libiberty/libiberty.a
>> > g++: error: unrecognized command line option ‘-no-pie’
>> > -- snip --
>> >
>> > (The compiler in PATH is "gcc (GCC) 4.8.5 20150623 (Red Hat
>> > 4.8.5-1)").
>>
>> Bootstrap should use the built compiler from stage2 in stage3, not
>> sure how you get the system compiler used there.
>
> I guess some configure script failed to notice that g++ is not
> being built, and a recent change introduced an option that the
> installed compiler doesn't have?  Probably configure should throw
> an error if bootstrapping is enabled but the c++ language is not
> enabled?

It is always enabled, are you sure it's not a pilot error?

Richard.

> Ciao
>
> Dominik ^_^  ^_^
>
> --
>
> Dominik Vogt
> IBM Germany
>


Re: Bootstrapping is currently broken

2016-03-07 Thread Dominik Vogt
On Mon, Mar 07, 2016 at 03:18:34PM +0100, Richard Biener wrote:
> On Mon, Mar 7, 2016 at 3:12 PM, Dominik Vogt  wrote:
> > On Mon, Mar 07, 2016 at 03:00:03PM +0100, Richard Biener wrote:
> >> On Mon, Mar 7, 2016 at 2:12 PM, Dominik Vogt  
> >> wrote:
> >> > A recent patch has broken bootstrapping (s390x) in stage3.  The
> >> > failure creeped into trunk between friday and today:
> >> >
> >> > -- snip --
> >> > g++ -std=gnu++98   -g -O2 -DIN_GCC -fno-exceptions -fno-rtti 
> >> > -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
> >> > -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic 
> >> > -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror   
> >> > -DHAVE_CONFIG_H -DGENERATOR_FILE -fno-PIE  -no-pie -o build/gencondmd \
> >> > build/gencondmd.o .././libiberty/libiberty.a
> >> > g++: error: unrecognized command line option ‘-no-pie’
> >> > -- snip --
> >> >
> >> > (The compiler in PATH is "gcc (GCC) 4.8.5 20150623 (Red Hat
> >> > 4.8.5-1)").
> >>
> >> Bootstrap should use the built compiler from stage2 in stage3, not
> >> sure how you get the system compiler used there.
> >
> > I guess some configure script failed to notice that g++ is not
> > being built, and a recent change introduced an option that the
> > installed compiler doesn't have?  Probably configure should throw
> > an error if bootstrapping is enabled but the c++ language is not
> > enabled?
> 
> It is always enabled, are you sure it's not a pilot error?

Sorry, I don't understand the term "pilot error".  I'm currently
rerunning the build to make sure I'm looking at the right compile
log.  There are so many test builds on the disk that I'm not quite
sure I've looked at the right build log.  (The bootstrap error is
real, though.)

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: Bootstrapping is currently broken

2016-03-07 Thread Andreas Krebbel
On 03/07/2016 02:12 PM, Dominik Vogt wrote:
> A recent patch has broken bootstrapping (s390x) in stage3.  The
> failure creeped into trunk between friday and today:
> 
> -- snip --
> g++ -std=gnu++98   -g -O2 -DIN_GCC -fno-exceptions -fno-rtti 
> -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
> -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic 
> -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror   
> -DHAVE_CONFIG_H -DGENERATOR_FILE -fno-PIE  -no-pie -o build/gencondmd \
> build/gencondmd.o .././libiberty/libiberty.a
> g++: error: unrecognized command line option ‘-no-pie’
> -- snip --
> 
> (The compiler in PATH is "gcc (GCC) 4.8.5 20150623 (Red Hat
> 4.8.5-1)").

Bootstrap on s390x worked fine for me with r234017 from last night.

We usually have the daily build system for that, but it is down after
migrating to a new machine. I'm working on it.

-Andreas-



Re: [gimplefe] [gsoc16] Gimple Front End Project

2016-03-07 Thread David Malcolm
On Mon, 2016-03-07 at 13:26 +0100, Richard Biener wrote:
> On Mon, Mar 7, 2016 at 7:27 AM, Prasad Ghangal <
> prasad.ghan...@gmail.com> wrote:
> > On 6 March 2016 at 21:13, Richard Biener <
> > richard.guent...@gmail.com> wrote:
> > > 
> > > I'll be willing to mentor this.  Though I'd rather have us
> > > starting from scratch and look at having a C-like input language,
> > > even piggy-backing on the C frontend maybe.
> > 
> > That's great. I would like to know scope of the project for gsoc so
> > that I can start preparing for proposal.
> 
> In my view (this may require discussion) the GIMPLE FE provides a way
> to do better unit-testing in GCC as in
> feeding a GIMPLE pass with specific IL to work with rather than
> trying
> to get that into proper shape via a C
> testcase.  Especially making the input IL into that pass stable over
> the development of GCC is hard.

I've been looking at the gimple FE recently, and the above is precisely
my own motivation.  Much of our current testing involves taking a C
file, running the pass pipeline over it, and then verifying properties
of one specific pass, and this worries me, since all of the intervening
passes can change, and thus can change the effective input seen by the
pass we were hoping to test, invalidating the test case.

As part of the "unit tests" idea:
  v1: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00765.html
  v2: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01224.html
  v3: https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02947.html
I attempted to write unit tests for specific passes.  The closest I got
was this, which built the function in tree form, then gimplified it,
then expanded it:
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02954.html

Whilst writing this I attempted to build test cases by constructing IR
directly via API calls, but it became clear to me that that approach
isn't a good one: it's very verbose, and would tie us to the internal
API.

(I think the above patch kit has merit for testing things other than
passes, as a "-fself-test" option, which I want to pursue for gcc 7).

So for testing specific passes, I'd much rather have an input format
for testing individual passes that:
  * can be easily generated by GCC from real test cases
  * can be turned into unit tests, which implies:
* human-readable and editable
  * compatibility and stability: can load gcc 7 test cases into gcc 8;
have e.g. gcc 6 generate test cases
  * roundtrippable: can load the format, then save out the IR, and get
the same file, for at least some subset of the format (consider e.g.
location data: explicit location data can be roundtripped, implicit
location data not so much).

...which suggests that we'd want to use gimple dumps as the input
format to a test framework - which leads naturally to the idea of a
gimple frontend.

I'm thinking of something like a testsuite/gimple.dg subdirectory full
of gimple dumps.

We could have a new kind of diagnostic, a "remark", with DejaGnu
directives to detect for it e.g.

  a_5 = b_1 * c_2;  /* { dg-remark "propagated constant; became a_5 =
b_1 * 3" } */

or whatnot. 

I see our dumpfiles as being something aimed at us, whereas remarks
could be aimed at sophisticated end-users; they would be enabled on a
per-pass basis, or perhaps for certain topics (e.g. vectorization) and
could look something like:

foo.c:27:10: remark: loop is not vectorizable since the iterator can be
modified... [-Rvectorization]
foo.c.35:20: ...here

or similar, where the user passed say "-Rvectorization" as a command
line option to request more info on vectorization, and our test suites
could do this.

As a thought-experiment, consider that as well as cc1 etc, we could
have an executable for every pass.  Then you could run individual
passes e.g.:

  $ run-vrp foo.gimple -o bar.gimple
  $ run-switchconv quux.gimple -o baz.gimple

etc.   (I'm not convinced that it makes sense to split things up so
much, but I find it useful for inspiration, for getting ideas about the
things that we could do if we had that level of modularity, especially
from a testing perspective).


FWIW I started looking at the existing gimple FE branch last week.  It
implements a parser for a tuple syntax, rather than the C-like syntax.

The existing parser doesn't actually generate any gimple IR
internally, it just syntax-checks the input file.  Building IR
internally seemed like a good next step, since I'm sure there are lots
of state issues to sort out.  So I started looking at implementing a
testsuite/gimple.dg/roundtrip subdirectory: the idea is that this would
be full of gimple dumps; the parser would read them in, and then (with
a flag supplied by roundtrip.exp) would write them out, and
roundtrip.exp would compare input to output and require them to be
identical.  I got as far as (partially) building a GIMPLE_ASSIGN
internally when parsing a file containing one.

That said, I don't care for the tuple syntax in the existing gimple
dump format; I'd prefer a C-like 

Re: Validity of SUBREG+AND-imm transformations

2016-03-07 Thread Richard Sandiford
Kyrill Tkachov  writes:
> On 05/03/16 05:52, Jeff Law wrote:
>> On 03/04/2016 09:33 AM, Kyrill Tkachov wrote:
>>>
>>> On 04/03/16 16:21, Jeff Law wrote:
 On 03/04/2016 08:05 AM, Richard Biener wrote:
>> does that mean that the shift amount should be DImode?
>> Seems like a more flexible approach would be for the midend to be able
>> to handle these things...
>
> Or macroize for all integer modes?
 That's probably worth exploring.  I wouldn't be at all surprised if it
 that turns out to be better than any individual mode,  not just for
 arm & aarch64, but would help a variety of targets.

>>>
>>> What do you mean by 'macroize' here? Do you mean use iterators to create
>>> multple variants of patterns with different
>>> modes on the shift amount?
>>> I believe we'd still run into the issue at
>>> https://gcc.gnu.org/ml/gcc/2016-03/msg00036.html.
>> We might, but I would expect the the number of incidences to be fewer.
>>
>> Essentially we're giving the compiler multiple options when it comes
>> to representation of the shift amount -- allowing the compiler
>> (combine in particular) to use the shift amount in whatever mode is
>> most natural. ie, if the count is
>> sitting in a QI, HI, SI or possibly even a DI register, then it can be
>> used as-is.  No subregs, no zero/sign extensions, or and-imm masking.
>>
>
> The RTL documentation for ASHIFT and friends says that the shift amount must 
> be:
> "a fixed-point mode or be a constant with mode @code{VOIDmode}; which
> mode is determined by the mode called for in the machine description
> entry for the left-shift instruction". For example, on the VAX, the mode
> of @var{c} is @code{QImode} regardless of @var{m}.
>
>  From what I understand the "ashl" standard name should expand to an
> ASHIFT with a particular mode for the shift amount.  Currently that's
> QImode for aarch64.  So whenever combine tries to propagate anything
> into the shift amount it has to force it into QImode.

Yeah.  It'd be nice to retain a predictable mode if possible.  One of
the things on my todo list for GCC 7 is to automatically generate a
function that gives you the correct shift amount mode for a particular
shift value mode, so that there's a lot less guessing.  (It's one of the
main blockers to having modes on CONST_INTs.)

Thanks,
Richard


Re: Validity of SUBREG+AND-imm transformations

2016-03-07 Thread Jeff Law

On 03/07/2016 03:44 AM, Kyrill Tkachov wrote:




The RTL documentation for ASHIFT and friends says that the shift amount
must be:
"a fixed-point mode or be a constant with mode @code{VOIDmode}; which
mode is determined by the mode called for in the machine description
entry for the left-shift instruction". For example, on the VAX, the mode
of @var{c} is @code{QImode} regardless of @var{m}.
Use QImode in the named pattern/expander and use the other modes in an 
unnamed/anonymous pattern.
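Concretely, that could look like the following illustrative sketch (the GPI iterator and the template strings are placeholders here, not the actual aarch64 patterns): the named expander keeps the existing QImode contract, while an anonymous pattern additionally matches a wider count mode for combine.

```lisp
;; Named expander keeps the QImode shift amount that "ashl<mode>3"
;; currently uses, so nothing changes for expand.
(define_expand "ashl<mode>3"
  [(set (match_operand:GPI 0 "register_operand")
        (ashift:GPI (match_operand:GPI 1 "register_operand")
                    (match_operand:QI 2 "nonmemory_operand")))])

;; Anonymous pattern also matching an SImode count, so combine can use
;; it directly without forcing a subreg to QImode.
(define_insn "*ashl<mode>3_si_count"
  [(set (match_operand:GPI 0 "register_operand" "=r")
        (ashift:GPI (match_operand:GPI 1 "register_operand" "r")
                    (match_operand:SI 2 "register_operand" "r")))]
  ""
  "lsl\\t%<w>0, %<w>1, %w2")
```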

Jeff


Re: [gimplefe] [gsoc16] Gimple Front End Project

2016-03-07 Thread Trevor Saunders
On Mon, Mar 07, 2016 at 11:33:55AM -0500, David Malcolm wrote:
> On Mon, 2016-03-07 at 13:26 +0100, Richard Biener wrote:
> > On Mon, Mar 7, 2016 at 7:27 AM, Prasad Ghangal <
> > prasad.ghan...@gmail.com> wrote:
> > > On 6 March 2016 at 21:13, Richard Biener <
> > > richard.guent...@gmail.com> wrote:
> > > > 
> > > > I'll be willing to mentor this.  Though I'd rather have us
> > > > starting from scratch and look at having a C-like input language,
> > > > even piggy-backing on the C frontend maybe.
> > > 
> > > That's great. I would like to know scope of the project for gsoc so
> > > that I can start preparing for proposal.
> > 
> > In my view (this may require discussion) the GIMPLE FE provides a way
> > to do better unit-testing in GCC as in
> > feeding a GIMPLE pass with specific IL to work with rather than
> > trying
> > to get that into proper shape via a C
> > testcase.  Especially making the input IL into that pass stable over
> > the development of GCC is hard.
> 
> I've been looking at the gimple FE recently, at the above is precisely
> my own motivation.  Much of our current testing involves taking a C
> file, running the pass pipeline over it, and then verifying properties
> of one specific pass, and this worries me, since all of the intervening
> passes can change, and thus can change the effective input seen by the
> pass we were hoping to test, invalidating the test case.
> 
> As part of the "unit tests" idea:
>   v1: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00765.html
>   v2: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01224.html
>   v3: https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02947.html
> I attempted to write unit tests for specific passes.  The closest I got
> was this, which built the function in tree form, then gimplified it,
> then expanded it:
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02954.html
> 
> Whilst writing this I attempted to build test cases by constructing IR
> directly via API calls, but it became clear to me that that approach
> isn't a good one: it's very verbose, and would tie us to the internal
> API.
> 
> (I think the above patch kit has merit for testing things other than
> passes, as a "-fself-test" option, which I want to pursue for gcc 7).
> 
> So for testing specific passes, I'd much rather have an input format
> for testing individual passes that:
>   * can be easily generated by GCC from real test cases
>   * ability to turn into unit tests, which implies:
> * human-readable and editable
>   * compatibility and stability: can load gcc 7 test cases into gcc 8;
> have e.g. gcc 6 generate test cases
>   * roundtrippable: can load the format, then save out the IR, and get
> the same file, for at least some subset of the format (consider e.g.
> location data: explicit location data can be roundtripped, implicit
> location data not so much).
> 
> ...which suggests that we'd want to use gimple dumps as the input
> format to a test framework - which leads naturally to the idea of a
> gimple frontend.

Assuming you mean the format from -fdump-tree-*, that's a kind of C-like
language, so it argues against using tuples like the existing gimple-fe
branch.

> I'm thinking of something like a testsuite/gimple.dg subdirectory full
> of gimple dumps.
> 
> We could have a new kind of diagnostic, a "remark", with DejaGnu
> directives to detect for it e.g.
> 
>   a_5 = b_1 * c_2;  /* { dg-remark "propagated constant; became a_5 =
> b_1 * 3" } */
> 
> or whatnot. 
> 
> I see our dumpfiles as being something aimed at us, whereas remarks
> could be aimed at sophisticated end-users; they would be enabled on a
> per-pass basis, or perhaps for certain topics (e.g. vectorization) and
> could look something like:

That's interesting; as you sort of note, the other option is to just scan
the output dump for what you intend to check.  The remark idea is
interesting though; the -Wsuggest-final-{method,type} warnings are
trying to be that, and ISTR something else like that.

> foo.c:27:10: remark: loop is not vectorizable since the iterator can be
> modified... [-Rvectorization]
> foo.c.35:20: ...here
> 
> or similar, where the user passed say "-Rvectorization" as a command
> line option to request more info on vectorization, and our test suites
> could do this.
> 
> As a thought-experiment, consider that as well as cc1 etc, we could
> have an executable for every pass.  Then you could run individual
> passes e.g.:
> 
>   $ run-vrp foo.gimple -o bar.gimple
>   $ run-switchconv quux.gimple -o baz.gimple
> 
> etc.   (I'm not convinced that it makes sense to split things up so
> much, but I find it useful for inspiration, for getting ideas about the
> things that we could do if we had that level of modularity, especially
> from a testing perpective).

yeah, though if you got rid of most / all of the other global state
maybe it wouldn't be hard?  but yeah it doesn't seem like the most
important thing either.

> FWIW I started looking at the existing gimple FE branch last week.