On Tue, Mar 8, 2016 at 4:59 PM, David Malcolm <dmalc...@redhat.com> wrote:

> My goal for unit-testing passes is to be able to dump/reload the GIMPLE
> IR in a form that's:
>   (A) readable by both humans and programs, and
>   (B) editable by humans
>   (C) roundtrippable for some subset of the IR
>   (D) can support the input for every gimple pass (pre-CFG,
> CFG-before-SSA, CFG-with-SSA, with/without lowered switches etc)
>   (E) can support the output of every gimple pass, apart from the final
> expansion to RTL.
>
> AIUI, Richard would also like:
>   (F) the form to be parsable as C
> (presumably some subset)
>
> LLVM IR is likely similar to GIMPLE IR, but AFAIK, LLVM IR requires
> SSA, which would seem to preclude using it (goals (D) and (E) above).

LLVM IR is an SSA IR, yes.  It also has two different representations:
a text-based one that its front end can parse, and a binary one
(bitcode) which is more efficient for purposes of LTO and such
(similar to the GIMPLE bytecode read by the LTO front end).

LLVM IR is actually lower in the IR abstraction spectrum.  It's closer
to RTL than it is to GIMPLE.  For instance, its type system is very
similar to RTL's: words, pointers and offsets.  A few machine features
creep in, but not many.  Function calls, for example, look very much
like C function calls.


>  Also, there may be other things we'd want to express in GIMPLE that
> might not be directly expressible in LLVM IR (Richard alluded to this
> earlier in this thread: the on-the-side data: range info, points-to
> info, etc).   Though I suspect converters may be feasible for some
> common subset of SSA IR.
>
> Regarding goal (F) above, AFAIK, LLVM IR has a textual assembly form and
> a bitcode form; does LLVM IR have a subset-of-C form?

Well, only in the sense that it kinda looks like a very low-level
version of C.  For instance,

int sum(int y, int x) { return x + y; }

becomes:

define i32 @sum(i32 %y, i32 %x) #0 {
entry:
  %y.addr = alloca i32, align 4
  %x.addr = alloca i32, align 4
  store i32 %y, i32* %y.addr, align 4
  store i32 %x, i32* %x.addr, align 4
  %0 = load i32, i32* %x.addr, align 4
  %1 = load i32, i32* %y.addr, align 4
  %add = add nsw i32 %0, %1
  ret i32 %add
}

Notice all the word operations and the SSA nature of the IL itself.
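
The allocas, loads and stores above are what you get without
optimization.  As a rough sketch (hand-written, not actual compiler
output, and with the attribute group omitted), once the stack slots
are promoted to registers I'd expect the function to reduce to
something like:

define i32 @sum(i32 %y, i32 %x) {
entry:
  %add = add nsw i32 %x, %y
  ret i32 %add
}

Each value is still defined exactly once, which is the SSA property.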


I'm hardly in a position to offer guidance here, but I'm not sure that
overloading the GIMPLE FE on the c-family FE is desirable in the long
term.  The two parsers are invariably going to differ in what they
want to accept, in their error messages, etc.  For modularity
purposes, having a separate module dealing with GIMPLE itself might be
better than piggybacking functionality on the C FE.

That way, implementing a library for working with GIMPLE becomes much
simpler, and it would provide a nice foundation for all kinds of
GIMPLE-oriented tooling in the future.



Diego.
