On Tue, Mar 08, 2016 at 05:12:56PM -0500, Diego Novillo wrote:
> On Tue, Mar 8, 2016 at 4:59 PM, David Malcolm <dmalc...@redhat.com> wrote:
>
> > My goal for unit-testing passes is to be able to dump/reload the GIMPLE
> > IR in a form that's:
> > (A) readable by both humans and programs, and
> > (B) editable by humans
> > (C) roundtrippable for some subset of the IR
> > (D) can support the input for every gimple pass (pre-CFG,
> > CFG-before-SSA, CFG-with-SSA, with/without lowered switches etc)
> > (E) can support the output of every gimple pass, apart from the final
> > expansion to RTL.
> >
> > AIUI, Richard would also like:
> > (F) the form to be parsable as C
> > (presumably some subset)
> >
> > LLVM IR is likely similar to GIMPLE IR, but AFAIK, LLVM IR requires
> > SSA, which would seem to preclude using it (goals (D) and (E) above).
>
> LLVM IR is an SSA IR, yes. It also has two different representations,
> a text-based one parseable with its front end, and a binary one
> (bitcode) which is more efficient for purposes of LTO and such
> (similar to the GIMPLE bytecode lto front end).
>
> LLVM IR is actually lower in the IR abstraction spectrum. It's closer
> to RTL than it is to GIMPLE. For instance, its type system is very
> similar to RTL: words, pointers and offsets. A few machine features
> creep in, but not many. Things like function calls look very much
> like a C function call.
>
> > Also, there may be other things we'd want to express in GIMPLE that
> > might not be directly expressible in LLVM IR (Richard alluded to this
> > earlier in this thread: the on-the-side data: range info, points-to
> > info, etc). Though I suspect converters may be feasible for some
> > common subset of SSA IR.
> >
> > Regarding goal (F) above, AFAIK, LLVM IR has a textual assembly form and
> > a bitcode form; does LLVM IR have a subset-of-C form?
>
> Well, in the sense that it kinda looks like a very low level version
> of C.
> For instance,
>
> int sum(int y, int x) { return x + y; }
>
> becomes:
>
> define i32 @sum(i32 %y, i32 %x) #0 {
> entry:
>   %y.addr = alloca i32, align 4
>   %x.addr = alloca i32, align 4
>   store i32 %y, i32* %y.addr, align 4
>   store i32 %x, i32* %x.addr, align 4
>   %0 = load i32, i32* %x.addr, align 4
>   %1 = load i32, i32* %y.addr, align 4
>   %add = add nsw i32 %0, %1
>   ret i32 %add
> }
>
> Notice all the word operations and the SSA nature of the IL itself.
>
>
> I'm hardly in a position to offer guidance here, but I'm not sure that
> overloading the gimple FE on the c-family FE is desirable long term.

I'm kind of on the fence there.

> The two parsers are going to invariably differ in what they want to
> accept, error messages, etc. For modularity purposes, having a

I'm not sure they really will so far as types are concerned. As far as
expressions and statements go they probably should be different.

> separate module dealing with GIMPLE itself might be better than
> piggybacking functionality on the C FE.

My concern is that writing a separate gimple parser will require
basically a copy of all the code to parse decls and types, but maybe that
isn't that much?

> This way, implementing a library that supports dealing with GIMPLE
> becomes much simpler. This provides a nice foundation for all kinds
> of gimple-oriented tooling in the future.

Well, one nice thing about choosing a subset of C as your textual
representation of gimple is that all the tools that deal with C already
can deal with it, and so you won't really need separate tools for gimple
(that would be my theory anyway).

Trev

>
> Diego.
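
To make the "subset of C" idea a bit more concrete, here is a rough sketch of
what a lowered, GIMPLE-like function might look like if it were written so
that an ordinary C compiler (or any other C tool) accepts it unchanged. This
is only an illustration under my own assumptions, not the syntax of any
actual GIMPLE front end: the function names, the temporary t1, the labels
bb_then/bb_else/bb_exit and the helper check_lowering are all made up for the
example.

```c
#include <assert.h>

/* The source-level function quoted earlier in the thread. */
int sum(int y, int x) { return x + y; }

/* Hypothetical lowered form: one operation per statement and an
   explicit temporary.  The name t1 is illustrative only. */
int sum_lowered(int y, int x)
{
  int t1;
  t1 = x + y;
  return t1;
}

/* Hypothetical CFG-style (pre-SSA) form: control flow is spelled out
   with labels and gotos, one label per basic block.  The label names
   are assumptions made for this sketch. */
int max_lowered(int a, int b)
{
  int result;
  if (a > b) goto bb_then; else goto bb_else;
bb_then:
  result = a;
  goto bb_exit;
bb_else:
  result = b;
bb_exit:
  return result;
}

/* Because the lowered forms are plain C, they can be checked against
   the originals with nothing more than a C compiler. */
void check_lowering(void)
{
  assert(sum_lowered(2, 3) == sum(2, 3));
  assert(max_lowered(4, 7) == 7);
  assert(max_lowered(9, 1) == 9);
}
```

The point of the sketch is just that nothing above needs a gimple-specific
tool to compile, format or index it. Whether SSA names, PHI nodes and the
on-the-side data mentioned earlier in the thread can be expressed as
naturally is a separate question that this example does not address.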