On Mon, 2016-03-07 at 13:26 +0100, Richard Biener wrote:
> On Mon, Mar 7, 2016 at 7:27 AM, Prasad Ghangal <
> prasad.ghan...@gmail.com> wrote:
> > On 6 March 2016 at 21:13, Richard Biener <
> > richard.guent...@gmail.com> wrote:
> > > 
> > > I'll be willing to mentor this.  Though I'd rather have us
> > > starting from scratch and look at having a C-like input language,
> > > even piggy-backing on the C frontend maybe.
> > 
> > That's great. I would like to know scope of the project for gsoc so
> > that I can start preparing for proposal.
> 
> In my view (this may require discussion) the GIMPLE FE provides a way
> to do better unit-testing in GCC as in feeding a GIMPLE pass with
> specific IL to work with rather than trying to get that into proper
> shape via a C testcase.  Especially making the input IL into that pass
> stable over the development of GCC is hard.

I've been looking at the gimple FE recently, and the above is precisely
my own motivation.  Much of our current testing involves taking a C
file, running the pass pipeline over it, and then verifying properties
of one specific pass.  This worries me: all of the intervening passes
can change, and thus can change the effective input seen by the pass we
were hoping to test, invalidating the test case.

As part of the "unit tests" idea:
  v1: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00765.html
  v2: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01224.html
  v3: https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02947.html
I attempted to write unit tests for specific passes.  The closest I got
was this, which built the function in tree form, then gimplified it,
then expanded it:
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02954.html

Whilst writing this I attempted to build test cases by constructing IR
directly via API calls, but it became clear to me that that approach
isn't a good one: it's very verbose, and would tie us to the internal
API.

(I think the above patch kit has merit for testing things other than
passes, as a "-fself-test" option, which I want to pursue for gcc 7).

So for testing specific passes, I'd much rather have an input format
that:
  * can be easily generated by GCC from real test cases
  * can be turned into unit tests, which implies:
    * human-readable and editable
  * is compatible and stable: can load gcc 7 test cases into gcc 8;
    have e.g. gcc 6 generate test cases
  * is roundtrippable: can load the format, then save out the IR, and
    get the same file, for at least some subset of the format (consider
    e.g. location data: explicit location data can be roundtripped,
    implicit location data not so much).

...which suggests that we'd want to use gimple dumps as the input
format to a test framework - which leads naturally to the idea of a
gimple frontend.

I'm thinking of something like a testsuite/gimple.dg subdirectory full
of gimple dumps.

We could have a new kind of diagnostic, a "remark", with DejaGnu
directives to check for it, e.g.

  a_5 = b_1 * c_2;  /* { dg-remark "propagated constant; became a_5 = b_1 * 3" } */

or whatnot. 

I see our dumpfiles as being something aimed at us, whereas remarks
could be aimed at sophisticated end-users; they would be enabled on a
per-pass basis, or perhaps for certain topics (e.g. vectorization) and
could look something like:

foo.c:27:10: remark: loop is not vectorizable since the iterator can be modified... [-Rvectorization]
foo.c:35:20: ...here

or similar, where the user passed say "-Rvectorization" as a command
line option to request more info on vectorization.  Our test suites
could pass the same option and check the resulting remarks.
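
To make the remark idea a bit more concrete, here is a rough Python
sketch of how a test driver might pair dg-remark directives with
remarks in compiler output.  Everything in it is an assumption made for
illustration: neither the directive nor the remark output format exists
today, the function names are made up, and it does a plain substring
match where a real DejaGnu directive would presumably take a regexp.

```python
import re

# Hypothetical directive syntax, modelled on the dg-remark example above.
DIRECTIVE = re.compile(r'\{\s*dg-remark\s+"([^"]*)"\s*\}')

# Hypothetical remark format: file:line:col: remark: text [-Rtopic]
REMARK = re.compile(
    r'^(?P<file>[^:]+):(?P<line>\d+):(?P<col>\d+): remark: '
    r'(?P<msg>.*?)(?: \[-R\w+\])?$'
)


def expected_remarks(source):
    """Map 1-based line number -> remark text expected on that line."""
    expected = {}
    for lineno, line in enumerate(source.splitlines(), start=1):
        m = DIRECTIVE.search(line)
        if m:
            expected[lineno] = m.group(1)
    return expected


def emitted_remarks(output):
    """Map line number -> list of remark messages seen in the output."""
    emitted = {}
    for line in output.splitlines():
        m = REMARK.match(line)
        if m:
            emitted.setdefault(int(m.group("line")), []).append(m.group("msg"))
    return emitted


def check(source, output):
    """Return (line, expected text) for every directive with no match."""
    emitted = emitted_remarks(output)
    failures = []
    for lineno, text in expected_remarks(source).items():
        # Substring match, as a stand-in for a regexp match.
        if not any(text in msg for msg in emitted.get(lineno, [])):
            failures.append((lineno, text))
    return failures
```

An empty list from check() would mean every directive in the dump was
matched by a remark reported on the same line.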

As a thought-experiment, consider that as well as cc1 etc, we could
have an executable for every pass.  Then you could run individual
passes e.g.:

  $ run-vrp foo.gimple -o bar.gimple
  $ run-switchconv quux.gimple -o baz.gimple

etc.  (I'm not convinced that it makes sense to split things up so
much, but I find it useful for inspiration, for getting ideas about the
things that we could do if we had that level of modularity, especially
from a testing perspective.)


FWIW I started looking at the existing gimple FE branch last week.  It
implements a parser for a tuple syntax, rather than the C-like syntax.

The existing parser doesn't actually generate any gimple IR
internally, it just syntax-checks the input file.  Building IR
internally seemed like a good next step, since I'm sure there are lots
of state issues to sort out.  So I started looking at implementing a
testsuite/gimple.dg/roundtrip subdirectory: the idea is that this would
be full of gimple dumps; the parser would read them in, and then (with
a flag supplied by roundtrip.exp) would write them out, and
roundtrip.exp would compare input to output and require them to be
identical.  I got as far as (partially) building a GIMPLE_ASSIGN
internally when parsing a file containing one.
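
As a toy illustration of the roundtrip check (not the code on the
branch, and not real gimple parsing), the following Python sketch
parses a made-up subset consisting only of binary assignments in a
C-like syntax, dumps them back out, and compares the result with the
input.  The Assign tuple, the regexp and the fixed two-space indent are
all assumptions of the sketch.

```python
import re
from typing import NamedTuple


class Assign(NamedTuple):
    """Toy stand-in for a GIMPLE_ASSIGN with a binary operator."""
    lhs: str
    op: str
    rhs1: str
    rhs2: str


# Accepts lines such as "  a_5 = b_1 * c_2;" (hypothetical subset only).
ASSIGN = re.compile(r'^\s*(\w+) = (\w+) ([-+*/]) (\w+);\s*$')


def parse(text):
    """Build a list of Assign statements, rejecting anything else."""
    stmts = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        m = ASSIGN.match(line)
        if m is None:
            raise ValueError(f"line {lineno}: cannot parse {line!r}")
        lhs, rhs1, op, rhs2 = m.groups()
        stmts.append(Assign(lhs, op, rhs1, rhs2))
    return stmts


def dump(stmts):
    """Write statements back out in one canonical layout."""
    return "".join(f"  {s.lhs} = {s.rhs1} {s.op} {s.rhs2};\n" for s in stmts)


def roundtrips(text):
    """True if parsing and then dumping reproduces the input exactly."""
    return dump(parse(text)) == text
```

Input that is already in the canonical layout roundtrips; input that
parses but is laid out differently (e.g. unindented) does not, which is
the "at least some subset of the format" caveat in miniature.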

That said, I don't care for the tuple syntax in the existing gimple
dump format; I'd prefer a C-like syntax.

My thought was to hack up the existing gimple FE branch to change the
parser to accept a more C-like syntax, but...

> A C-like syntax is preferred, a syntax that is also valid C would be
> even more preferred so that you can write "torture" testcases that
> have fixed IL into a specific pass but also run as regular testcases
> through the whole optimization pipeline.
> 
> Piggy-backing on the C frontend makes it possible to leave all the
> details of types and declarations and global initializers as plain C
> while interpreting function bodies as "GIMPLE" when leaving the
> frontend.

...it sounds like you have a radically different implementation idea,
in which the gimple frontend effectively becomes part of the C
frontend, with some different behaviors.

> I expect that in the process of completing GIMPLE IL features you'll
> have to add a few GNU C extensions, mostly for features used by Ada
> (self-referential types come to my mind).
> 
> I expect the first thing the project needs to do is add the "tooling"
> side, signalling the C frontend it should accept GIMPLE (add a
> -fgimple flag) plus adding a way to input IL into a specific pass
> (-ftest=<pass> or a function attribute so it affects only a specific
> function so you can write a testcase driver in plain C and have the
> actual testcase in a single function).  The first actual frontend
> implementation challenge will then be emitting GIMPLE / CFG / SSA
> directly which I'd do in the "genericization" phase.  Adjustments to
> how the C FE handles expressions should be made as well, for example
> I'd remove any promotions done, letting it only literally parse
> expressions.  Maybe statement and expression parsing should be forked
> directly to not make the C FE's code too unwieldy but as said I'd keep
> type and decl parsing and its data structures as is.
> 
> Eventually the dump file format used by GCC's GIMPLE dumps should be
> changed to be valid GIMPLE FE inputs (and thus valid C inputs).
> Adjustments mainly need to be done to basic-block labels and PHI
> nodes.
> 
> I'd first not think about our on-the-side data too much initially
> (range info, points-to info, etc).
> 
> Richard.

Hope this is constructive
Dave
