On Mon, Mar 07, 2016 at 11:33:55AM -0500, David Malcolm wrote:
> On Mon, 2016-03-07 at 13:26 +0100, Richard Biener wrote:
> > On Mon, Mar 7, 2016 at 7:27 AM, Prasad Ghangal <
> > prasad.ghan...@gmail.com> wrote:
> > > On 6 March 2016 at 21:13, Richard Biener <
> > > richard.guent...@gmail.com> wrote:
> > > >
> > > > I'll be willing to mentor this.  Though I'd rather have us
> > > > starting from scratch and look at having a C-like input language,
> > > > even piggy-backing on the C frontend maybe.
> > >
> > > That's great. I would like to know the scope of the project for GSoC
> > > so that I can start preparing a proposal.
> >
> > In my view (this may require discussion) the GIMPLE FE provides a way
> > to do better unit-testing in GCC, as in feeding a GIMPLE pass with
> > specific IL to work with rather than trying to get that into proper
> > shape via a C testcase.  Especially making the input IL into that pass
> > stable over the development of GCC is hard.
>
> I've been looking at the gimple FE recently, and the above is precisely
> my own motivation.  Much of our current testing involves taking a C
> file, running the pass pipeline over it, and then verifying properties
> of one specific pass, and this worries me, since all of the intervening
> passes can change, and thus can change the effective input seen by the
> pass we were hoping to test, invalidating the test case.
>
> As part of the "unit tests" idea:
>   v1: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00765.html
>   v2: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01224.html
>   v3: https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02947.html
> I attempted to write unit tests for specific passes.
> The closest I got was this, which built the function in tree form,
> then gimplified it, then expanded it:
>   https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02954.html
>
> Whilst writing this I attempted to build test cases by constructing IR
> directly via API calls, but it became clear to me that that approach
> isn't a good one: it's very verbose, and would tie us to the internal
> API.
>
> (I think the above patch kit has merit for testing things other than
> passes, as a "-fself-test" option, which I want to pursue for gcc 7).
>
> So for testing specific passes, I'd much rather have an input format
> for testing individual passes that:
>   * can be easily generated by GCC from real test cases
>   * ability to turn into unit tests, which implies:
>     * human-readable and editable
>   * compatibility and stability: can load gcc 7 test cases into gcc 8;
>     have e.g. gcc 6 generate test cases
>   * roundtrippable: can load the format, then save out the IR, and get
>     the same file, for at least some subset of the format (consider e.g.
>     location data: explicit location data can be roundtripped, implicit
>     location data not so much).
>
> ...which suggests that we'd want to use gimple dumps as the input
> format to a test framework - which leads naturally to the idea of a
> gimple frontend.
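The "roundtrippable" requirement in the quoted list can be illustrated with a toy sketch. This is not GCC code and does not parse real gimple dumps: `toy_assign`, `toy_parse`, `toy_dump` and `toy_roundtrips` are hypothetical names, and the only input handled is a single binary assignment written in one canonical spelling. It only shows the shape of the check: load the text, write it back out, and require the two to be identical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for an IR statement: "lhs = rhs1 op rhs2;".
   Hypothetical; real GIMPLE statements carry far more state.  */
struct toy_assign
{
  char lhs[32];
  char rhs1[32];
  char op;
  char rhs2[32];
};

/* Parse one statement; returns nonzero on success.  */
static int
toy_parse (const char *text, struct toy_assign *out)
{
  return sscanf (text, "%31s = %31s %c %31[^;];",
                 out->lhs, out->rhs1, &out->op, out->rhs2) == 4;
}

/* Write the statement back out in the canonical spelling.  */
static void
toy_dump (const struct toy_assign *s, char *buf, size_t len)
{
  snprintf (buf, len, "%s = %s %c %s;", s->lhs, s->rhs1, s->op, s->rhs2);
}

/* The roundtrip property: parse, dump, and compare with the input.  */
static int
toy_roundtrips (const char *text)
{
  struct toy_assign s;
  char buf[128];

  if (!toy_parse (text, &s))
    return 0;
  toy_dump (&s, buf, sizeof buf);
  return strcmp (text, buf) == 0;
}

/* Small self-check in the spirit of a unit test.  */
static void
toy_selfcheck (void)
{
  assert (toy_roundtrips ("a_5 = b_1 * c_2;"));
  /* Parses, but extra spaces are lost on output, so no roundtrip.  */
  assert (!toy_roundtrips ("a_5  =  b_1 * c_2;"));
}
```

The second assertion mirrors the "at least some subset of the format" caveat: anything the loader accepts but the dumper normalizes away falls outside the roundtrippable subset.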

Assuming you mean the format from -fdump-tree-*, that's a kind of C-like
language, so it argues against using tuples like the existing gimple-fe
branch.

> I'm thinking of something like a testsuite/gimple.dg subdirectory full
> of gimple dumps.
>
> We could have a new kind of diagnostic, a "remark", with DejaGnu
> directives to detect it, e.g.
>
>   a_5 = b_1 * c_2;  /* { dg-remark "propagated constant; became a_5 = b_1 * 3" } */
>
> or whatnot.
>
> I see our dumpfiles as being something aimed at us, whereas remarks
> could be aimed at sophisticated end-users; they would be enabled on a
> per-pass basis, or perhaps for certain topics (e.g. vectorization) and
> could look something like:

That's interesting; as you sort of note, the other option is to just scan
the output dump for what you intend to check.  The remark idea is
interesting though: the -Wsuggest-final-{method,type} warnings are trying
to be that, and ISTR something else like that.

> foo.c:27:10: remark: loop is not vectorizable since the iterator can be
> modified... [-Rvectorization]
> foo.c:35:20: ...here
>
> or similar, where the user passed say "-Rvectorization" as a command
> line option to request more info on vectorization, and our test suites
> could do this.
>
> As a thought-experiment, consider that as well as cc1 etc, we could
> have an executable for every pass.  Then you could run individual
> passes e.g.:
>
>   $ run-vrp foo.gimple -o bar.gimple
>   $ run-switchconv quux.gimple -o baz.gimple
>
> etc.  (I'm not convinced that it makes sense to split things up so
> much, but I find it useful for inspiration, for getting ideas about the
> things that we could do if we had that level of modularity, especially
> from a testing perspective).

Yeah, though if you got rid of most / all of the other global state maybe
it wouldn't be hard?  But yeah, it doesn't seem like the most important
thing either.

> FWIW I started looking at the existing gimple FE branch last week.
> It implements a parser for a tuple syntax, rather than the C-like syntax.
>
> The existing parser doesn't actually generate any gimple IR
> internally, it just syntax-checks the input file.  Building IR
> internally seemed like a good next step, since I'm sure there are lots
> of state issues to sort out.  So I started looking at implementing a
> testsuite/gimple.dg/roundtrip subdirectory: the idea is that this would
> be full of gimple dumps; the parser would read them in, and then (with
> a flag supplied by roundtrip.exp) would write them out, and
> roundtrip.exp would compare input to output and require them to be
> identical.  I got as far as (partially) building a GIMPLE_ASSIGN
> internally when parsing a file containing one.
>
> That said, I don't care for the tuple syntax in the existing gimple
> dump format; I'd prefer a C-like syntax.

Agreed, and being compatible with the existing dumps suggests it too.

> My thought was to hack up the existing gimple FE branch to change the
> parser to accept a more C-like syntax, but...
>
> > A C-like syntax is preferred; a syntax that is also valid C would be
> > even more preferred, so that you can write "torture" testcases that
> > have fixed IL into a specific pass but also run as regular testcases
> > through the whole optimization pipeline.
> >
> > Piggy-backing on the C frontend makes it possible to leave all the
> > details of types and declarations and global initializers as plain C
> > while interpreting function bodies as "GIMPLE" when leaving the
> > frontend.
>
> ...it sounds like you have a radically different implementation idea,
> in which the gimple frontend effectively becomes part of the C
> frontend, with some different behaviors.

Well, it seems like if the existing gimple-fe is basically just a parser
for a language we don't like, there isn't much value in building off of
it instead of writing something from scratch.
Being compatible with C, probably with some builtins to do SSA stuff,
seems pretty nice.  I worry some about the work to avoid folding and
stuff, but sharing code with the c-family languages seems good if we can.

Trev

>
> > I expect that in the process of completing GIMPLE IL features you'll
> > have to add a few GNU C extensions, mostly for features used by Ada
> > (self-referential types come to my mind).
> >
> > I expect the first thing the project needs to do is add the "tooling"
> > side, signalling the C frontend it should accept GIMPLE (add a
> > -fgimple flag) plus adding a way to input IL into a specific pass
> > (-ftest=<pass> or a function attribute so it affects only a specific
> > function so you can write a testcase driver in plain C and have the
> > actual testcase in a single function).  The first actual frontend
> > implementation challenge will then be emitting GIMPLE / CFG / SSA
> > directly, which I'd do in the "genericization" phase.  Adjustments to
> > how the C FE handles expressions should be made as well; for example
> > I'd remove any promotions done, letting it only literally parse
> > expressions.  Maybe statement and expression parsing should be forked
> > directly to not make the C FE's code too unwieldy, but as said I'd
> > keep type and decl parsing and its data structures as is.
> >
> > Eventually the dump file format used by GCC's GIMPLE dumps should be
> > changed to be valid GIMPLE FE inputs (and thus valid C inputs).
> > Adjustments mainly need to be done to basic-block labels and PHI nodes.
> >
> > I'd not think about our on-the-side data too much initially
> > (range info, points-to info, etc).
> >
> > Richard.
>
> Hope this is constructive
> Dave
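
To make the "also valid C" idea in the quoted text concrete, here is a minimal sketch of what such a testcase might look like: a plain C driver plus a single function whose body is written in a flat, three-address style resembling a gimple dump. The names `toy_testcase` and `toy_driver` are hypothetical, and nothing here uses the `-fgimple` or `-ftest=<pass>` options, which are only proposals in the message above; the point is just that this form already compiles as ordinary C and so could also run through the whole pipeline.

```c
#include <assert.h>

/* Hypothetical testcase function.  The body is flat, one operation
   per statement, with SSA-like names (b_1, c_2, a_5) that are simply
   ordinary C identifiers here.  */
static int
toy_testcase (int b_1)
{
  int c_2;
  int a_5;

  c_2 = 3;
  a_5 = b_1 * c_2;
  return a_5;
}

/* Plain C driver, as in a regular execution test: it checks the
   result whatever the optimizers did to toy_testcase.  */
static void
toy_driver (void)
{
  assert (toy_testcase (7) == 21);
}
```

Calling `toy_driver ()` from `main` turns this into a regular run-time test.  PHI nodes and basic-block labels have no direct spelling in this form, which matches the adjustments the quoted text says the dump format would need.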