On Wed, Mar 9, 2016 at 3:27 PM, Andrew MacLeod <amacl...@redhat.com> wrote:
> On 03/07/2016 11:33 AM, David Malcolm wrote:
>>>
>>> So for testing specific passes, I'd much rather have an input format
>>> for testing individual passes that:
>>>   * can be easily generated by GCC from real test cases
>>>   * ability to turn into unit tests, which implies:
>>>     * human-readable and editable
>>>     * compatibility and stability: can load gcc 7 test cases into gcc 8;
>>>       have e.g. gcc 6 generate test cases
>>>     * roundtrippable: can load the format, then save out the IR, and get
>>>       the same file, for at least some subset of the format (consider e.g.
>>>       location data: explicit location data can be roundtripped, implicit
>>>       location data not so much).
>>>
>>> ...which suggests that we'd want to use gimple dumps as the input
>>> format to a test framework - which leads naturally to the idea of a
>>> gimple frontend.
>
> We already read and write gimple IL in LTO, we just do it in binary form.  I
> think the kind of effort you are talking about here is best placed in
> attaching a gimple parser to LTO, thus giving LTO the ability to read and
> write textual gimple as well as the current binary form.  The current
> dump format could in theory be a starting point, but it's clearly missing
> hunks of stuff.  There is probably a better representation.
>
> LTO already knows all the bits required to reconstruct gimple.  The
> definition of the textual representation can make intelligent choices about
> defaults so that you don't have to specify every single bit in the textual
> form that the binary form requires.  This seems far easier to me than
> starting with the incomplete form that the current dumps generate and trying
> to discover what other bits need to be added to properly reconstruct the IL.
> I think it's hard to get a lot of the subtle things right.  I also think
> the scope of defining and reading/writing should be relatively manageable.
> We can optimize the details once it's working.
>
> It would also be very useful then to have LTO enhanced so that it can read
> and write before or after any pass.  Then we can unit test any pass by
> injecting the IL immediately before the pass.  No jumping through any hoops
> to make sure the pass you care about sees the exact IL you want.  That is
> also a good proof that the LTO form (both binary and text) does fully
> represent gimple.  We can also use this output as our debugging dumps and
> archive the current dumper.
>
> As gimple changes and evolves the result is only one place to worry about
> for reading and writing... and as we progress (slowly) towards uncoupling
> the middle/backend from the front ends, we'd have a single well defined
> "front end" for gimple that accepts binary or text.

So I chose to reply to this one (and will refrain from replying to the
others but try to address comments there).

First, while the LTO approach works it's quite overkill in the details
it "dumps" and thus it's too closely tied to our internal bits, which
means testcases will bitrot too quickly for the number one goal of
having human maintainable testcases.

It's nice if there's going to be somebody spending a good part of his
work time on unit-testing (hope not specifically "the GIMPLE
frontend").

In my view the C frontend can already target most of the middle-end
features, and for those it can't it should be straightforward to add
GNU extensions.  A critical piece here is of course SSA, specifically
PHIs.  I think a reasonable way to express those in C is to use labels:

  int i;
  if (...)
    {
L1:
      i_1 = 2;
    }
  else
    {
L2:;
    }
  i_3 = __PHI (L1:i_1, L2:i);

so the testcases would be valid GNU C (not C).  What would be missing
for unit-testing would be some "raw" mode to avoid having the C FE
fold things or apply type promotions (so you can actually write a
signed short addition).

As for restricting statements to GIMPLE, I think that's not necessary -
I'd simply run the GENERIC from the FE through the gimplifier (I have
patches that deal with SSA pre into-SSA just fine, at least for
non-PHIs, and if all the __PHI above could be just an internal function
pre "real" SSA).

Note that I don't think we should restrict ourselves by connecting what
LTO does with what the requirements for unit testing are.

The convenient bit of extending the C FE here is that dumping a
function body in the required form is going to be easy and that you
can have a testcase harness in plain C while feeding in a unit-test
function as "GNU C GIMPLE" (or whatever you'll call it).
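
To make the semantics of the label-based __PHI notation above concrete,
here is a minimal sketch in ordinary C that emulates it by recording
which block control came from.  The names phi_example, pred, FROM_L1 and
FROM_L2 are made up for illustration and are not GCC identifiers, and
this is only an emulation of what a PHI selects, not a proposed
implementation of the extension.

```c
/* Hypothetical plain-C emulation of  i_3 = __PHI (L1:i_1, L2:i);
   none of these names come from GCC.  */

enum pred { FROM_L1, FROM_L2 };   /* which block reached the join point */

int
phi_example (int cond, int i)
{
  int i_1 = 0;   /* initialised only to keep plain C well-defined */
  int i_3;
  enum pred from;

  if (cond)
    {
      i_1 = 2;          /* block labelled L1 in the sketch above */
      from = FROM_L1;
    }
  else
    {
      from = FROM_L2;   /* block labelled L2: i is left untouched */
    }

  /* The PHI: pick the operand that belongs to the incoming edge.  */
  i_3 = (from == FROM_L1) ? i_1 : i;
  return i_3;
}
```

Here phi_example (1, 7) yields 2 (the value arriving via L1) and
phi_example (0, 7) yields 7 (the incoming i via L2), which is the
selection a real PHI node would make without any explicit tag.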

Say, with the harness in plain C and the unit-test function marked as
GIMPLE:

  extern void abort (void);
  int x = 1;

  int __attribute__((GIMPLE)) foo ()
  {
    int _1;
    _1 = x;
    return _1;
  }

  int main()
  {
    if (foo () != 1)
      abort ();
    return 0;
  }

and the above would extend to __attribute__((RTL)) if anybody wants to
introduce that.  Give 'GIMPLE' an argument like
__attribute__((GIMPLE("tree-pre"))) to specify the place to inject the
function [you still have to feed it to the cgraph from the beginning of
course, but the pass manager would skip anything before tree-pre, for
example, while still eventually computing IL side-data via required
PROP_s].

Yes, a textual form for LTO data would be nice (or rather a
self-descriptive LTO data format so you can have external tools dump
it).  But I don't think using the LTO dumper will work for unit testing.

About using the LLVM IR - similar issue I think, plus it is probably too
far away from GCC, so that what we'll end up with will only look like
LLVM IR but not actually be LLVM IR.

I think that by sticking to C and re-using (parts of) the frontend, the
path to a first "success" can be much shorter, which I think is
important for the project to not bitrot in an unusable state like the
last attempt.

Of course, while I can spend some cycles mentoring a GSoC student, I
won't spend a significant fraction of my work time on this project.

Richard.

> Andrew
>