Re: [Mesa-dev] [RFC PATCH 00/16] A new IR for Mesa

John Kessenich Wed, 27 Aug 2014 11:53:37 -0700

Interesting points, Jose.  It turns out LLVM IR is an IR that works well for
both uses.  Slicing it a bit differently, if you were to look at just a
"binary language" (that is, not a "binary representation"), it is


a) the *language* is good for communicating between different layers

b) the *representation* is good for doing code transforms on

Some IRs are just languages, while others have rich internal
representations that make it easy to operate on them.

LLVM IR has both:  It has a language form for transport, and a
representation form for transforms. Regarding synergy, note that Khronos
has SPIR (LLVM-based IR) and announced at Siggraph a binary language for
GLSL.
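
To make the language/representation split concrete, here is a minimal
Python sketch. Everything in it is an assumption made up for illustration
(the toy textual syntax and the names Value, parse and emit); it is not
LLVM's or Mesa's API. The text is the "language" you would hand between
layers, and the linked objects are the "representation" a transform would
walk.

```python
TEXT = "a = input\nb = input\nc = add a b\nd = mul c c"


class Value:
    """One value in the in-memory (representation) form. Hypothetical."""

    def __init__(self, name, op):
        self.name = name
        self.op = op
        self.operands = []  # direct references to the defining Values
        self.uses = []      # Values that consume this one


def parse(text):
    """Language form -> representation form."""
    by_name = {}
    order = []
    for line in text.strip().splitlines():
        lhs, rhs = line.split("=", 1)
        op, *args = rhs.split()
        value = Value(lhs.strip(), op)
        for arg in args:
            definition = by_name[arg]  # operands must be defined earlier
            value.operands.append(definition)
            definition.uses.append(value)
        by_name[value.name] = value
        order.append(value)
    return order


def emit(order):
    """Representation form -> language form."""
    lines = []
    for value in order:
        words = [value.name, "=", value.op] + [o.name for o in value.operands]
        lines.append(" ".join(words))
    return "\n".join(lines)
```

In this toy, emit(parse(TEXT)) gives back TEXT unchanged, while the parsed
objects let a pass ask "who uses a?" without re-scanning any text.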

Cheers,
JohnK



On Mon, Aug 18, 2014 at 9:47 AM, Jose Fonseca <jfons...@vmware.com> wrote:

> On 18/08/14 14:21, Marek Olšák wrote:
>
>> On Mon, Aug 18, 2014 at 2:44 PM, Roland Scheidegger <srol...@vmware.com>
>> wrote:
>>
>>> Am 16.08.2014 02:12, schrieb Connor Abbott:
>>>
>>>> I know what you might be thinking right now. "Wait, *another* IR? Don't
>>>> we already have like 5 of those, not counting all the driver-specific
>>>> ones? Isn't this stuff complicated enough already?" Well, there are some
>>>> pretty good reasons to start afresh (again...). In the years we've been
>>>> using GLSL IR, we've come to realize that, in fact, it's not what we
>>>> want *at all* to do optimizations on. Ian has done a talk at FOSDEM that
>>>> highlights some of the problems they've run into:
>>>>
>>>> https://video.fosdem.org/2014/H1301_Cornil/Saturday/Three_Years_Experience_with_a_Treelike_Shader_IR.webm
>>>>
>>>> But here's the summary:
>>>>
>>>> * GLSL IR is way too much of a memory hog, since it has to make a new
>>>> variable for each temporary the compiler creates, and then each time you
>>>> want to dereference that temporary you need to create an
>>>> ir_dereference_variable that points to it, which is also very
>>>> cache-unfriendly ("downright cache-mean!").
>>>>
>>>> * The expression trees were originally added so that we could do
>>>> pattern matching to automatically optimize things, but this turned out
>>>> to be both very difficult to do and not very helpful. Instead, all it
>>>> does is add more complexity to the IR without much benefit - with SSA or
>>>> having proper use-def chains, we could get back what the trees give us
>>>> while also being able to do lots more optimizations.
>>>>
>>>> * We don't have the concept of basic blocks in GLSL IR, which makes a
>>>> lot of optimizations harder because they were originally designed with
>>>> basic blocks in mind - take, for example, my SSA series. I had to map a
>>>> whole lot of concepts that were based on the control flow graph to this
>>>> tree of statements that GLSL IR uses, and the end result wound up
>>>> looking nothing at all like the original paper. This problem gets even
>>>> worse for things like Global Code Motion that depend upon having
>>>> the dominance tree.
>>>>
>>>> I originally wanted to modify GLSL IR to fix these problems by adding
>>>> new instruction types that would address these issues and then
>>>> converting back and forth between the old and the new form, but I
>>>> realized that fixing all the problems would basically mean a complete
>>>> rewrite - and if that's the case, then why don't we start from scratch?
>>>> So I took Ken's suggestions and started designing, and then at Intel
>>>> over the summer started implementing, a completely new IR which I call
>>>> NIR that's at a lower level than GLSL IR, but still high-level enough to
>>>> be mostly device-independent (different drivers may have different
>>>> passes and different ways of lowering e.g. matrix multiplies) so that
>>>> we can do generic optimizations on it. Having support for SSA from the
>>>> beginning was also a must, because lots of optimisations that we really
>>>> want for cleaning up DX9-translated games are either a lot easier in or
>>>> made possible by SSA. I also made the decision for it to be typeless,
>>>> because that's what the cool kids are all doing :) and for a
>>>> lower-level, flat IR it seemed like the thing to do (it could have gone
>>>> either way, though). So the key design points of NIR (pronounced either
>>>> like "near" as in "NIR is near!" or to rhyme with "burr") are:
>>>>
>>>> * It's flat (no expression trees)
>>>>
>>>> * It's typeless
>>>>
>>>> * Modifiers (abs, negate, saturate), swizzles, and write masks are part
>>>> of ALU instructions
>>>>
>>>> * It includes enough GLSL-like things (variables that you can load from
>>>> or store to, function calls) to be hardware-agnostic and to allow
>>>> optimizations at a high level (although we don't have a way to
>>>> represent matrix multiplies right now, that could easily be added), while
>>>> having lowering passes that convert variables to registers and
>>>> input/output/uniform loads/stores that will open up more opportunities
>>>> for optimization and save memory while being more hardware-specific.
>>>>
>>>> * Control flow consists of a tree of if statements and loops, like in
>>>> GLSL IR, except the leaves of the tree are now basic blocks instead of
>>>> instructions. Also, each basic block keeps track of its successors and
>>>> predecessors, so the control flow graph is explicit in the IR.
>>>>
>>>> * SSA is natively supported, and SSA uses point directly to the SSA
>>>> definition, which means that the use-def chains are always there, and
>>>> def-use chains are kept by tracking the set of all uses for each
>>>> definition.
>>>>
>>>> * It's written in C.
>>>>
>>>> (see the README in patch 3 and nir.h in patch 4 for more details)
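
A rough Python sketch of the design points listed above (flat SSA
instructions, use-def and def-use chains, basic blocks that know their
successors and predecessors). All names here (SSADef, Block, link,
replace_all_uses) are hypothetical and are not the structures in nir.h;
modifiers, swizzles and write masks are left out.

```python
class SSADef:
    """A flat SSA instruction: no nested expression trees."""

    def __init__(self, name, op, srcs=()):
        self.name = name
        self.op = op
        self.srcs = list(srcs)  # use-def: each source is the definition itself
        self.uses = set()       # def-use: every instruction reading this def
        for src in self.srcs:
            src.uses.add(self)


class Block:
    """A basic block that carries its own control flow graph edges."""

    def __init__(self, label):
        self.label = label
        self.instrs = []
        self.successors = []
        self.predecessors = []


def link(pred, succ):
    """Record a CFG edge on both ends so the graph stays explicit."""
    pred.successors.append(succ)
    succ.predecessors.append(pred)


def replace_all_uses(old, new):
    """Rewrite every user of `old` to read `new`, using the def-use set."""
    for user in list(old.uses):
        user.srcs = [new if src is old else src for src in user.srcs]
        new.uses.add(user)
    old.uses.clear()
```

Because the uses of a definition are tracked, something like copy
propagation reduces to a single replace_all_uses call instead of a walk
over the whole shader.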
>>>>
>>>> Some things that are missing or could be improved:
>>>>
>>>> * There's currently no alias tracking for inputs, outputs, and uniforms.
>>>> This is especially important for uniforms because we don't pack them
>>>> like we pack inputs and outputs.
>>>>
>>>> * We need a way to represent matrix multiplies so that we can do
>>>> matrix-flipping optimizations in NIR (currently GLSL IR does this for
>>>> us).
>>>>
>>>> * I'm not entirely happy about how we represent loads and stores in the
>>>> IR. Right now, they're intrinsics, but that means we need a different
>>>> intrinsic for each size and combination of arguments (indirect vs. not
>>>> indirect, etc.) and we might run into a combinatorial explosion problem
>>>> in the future, so we might need to make separate load/store instructions
>>>> like what I did for textures.
>>>>
>>>> * Right now, we only have a pass that lowers variables for scalar
>>>> backends. We need to write a similar pass for vector backends that uses
>>>> std140 packing or something similar, as well as porting
>>>> lower_ubo_reference to NIR and changing it to output offsets in the
>>>> hardware-native units instead of bytes.
>>>>
>>>> * We'll need to write a pass that splits up vector expressions for
>>>> scalar backends.
>>>>
>>>
>>>
>>  [...]
>
>
>> However, let's face it, gallium is stuck with TGSI
>> forever. Switching to another IR in Gallium is insane (unless you can
>> rewrite all drivers and state trackers for it - let's be realistic, it
>> just won't happen). The next GL NG IR, whatever it is going to be,
>> will be just as important as the IR of ARB_vertex_program. TGSI will
>> continue to be the major IR whether we like it or not.
>>
>
>
>
> No, switching to another IR in Gallium is not insane if approached the
> right way.   We already allow multiple IRs in gallium, so all it takes to
> move to another IR is to have helper modules to do the translation:
>
> - a pipe driver helper module that would translate new IR into TGSI, for
> the sake of old pipe drivers
>
>
> - a state tracker helper module that would translate TGSI into the new IR,
> for the sake of old state trackers.
>
>
> Once these are in place, all development effort can go into
> improving/leveraging the new IR.  We could deprecate TGSI once it
> has few users.
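
The two helper modules described above could look roughly like this in
Python. The two formats are toy stand-ins invented for illustration (the
"old" one is lines of text, the "new" one is tuples); neither is real TGSI
or any actual Mesa IR, and the function names are assumptions.

```python
# Toy "new IR": (opcode, destination, sources) tuples.
NEW_IR = [
    ("mul", "t0", ("in0", "in1")),
    ("add", "out0", ("t0", "in0")),
]


def new_to_old(new_ir):
    """Pipe driver helper: lets an old driver consume the new IR."""
    return [" ".join((op.upper(), dst, *srcs)) for op, dst, srcs in new_ir]


def old_to_new(old_ir):
    """State tracker helper: lets an old state tracker feed new drivers."""
    translated = []
    for line in old_ir:
        op, dst, *srcs = line.split()
        translated.append((op.lower(), dst, tuple(srcs)))
    return translated
```

With both directions available, old and new components can be mixed
freely; in this toy, old_to_new(new_to_old(NEW_IR)) returns NEW_IR
unchanged.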
>
>
>
>
> I also want to highlight there are two kinds of "IR".
>
>
> a) one thing is a shader IR that communicates a shader between an
> interface (be it application interface
>
>        High-level lang.             IR               GPU code
>   App -----------------> front-end ----> back-end ---------->  GPU
>
> b) another is a shader IR that is meant to facilitate code transformations
> (i.e. optimizations):
>
>       opt. pass     opt. pass
>    IR ---------> IR ---------> IR --> ....
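
A minimal Python sketch of kind b): each pass takes an IR and returns an
IR, so passes chain freely. The tuple format (dst, op, args) and the pass
names are hypothetical and only cover straight-line code with "const" and
"add" folding.

```python
def fold_constants(ir):
    """Replace an add of two known constants with a single constant."""
    consts = {}
    out = []
    for dst, op, args in ir:
        if op == "const":
            consts[dst] = args[0]
        elif op == "add" and all(a in consts for a in args):
            value = consts[args[0]] + consts[args[1]]
            consts[dst] = value
            op, args = "const", (value,)
        out.append((dst, op, args))
    return out


def eliminate_dead_code(ir, live_out):
    """Drop instructions whose result never reaches a live output."""
    live = set(live_out)
    kept = []
    for dst, op, args in reversed(ir):
        if dst in live:
            kept.append((dst, op, args))
            if op != "const":
                live.update(args)  # operands of a live instruction are live
    kept.reverse()
    return kept


def run_passes(ir, passes):
    """IR -> pass -> IR -> pass -> IR ..."""
    for optimization in passes:
        ir = optimization(ir)
    return ir


PROGRAM = [
    ("a", "const", (2,)),
    ("b", "const", (3,)),
    ("c", "add", ("a", "b")),
    ("d", "mul", ("c", "x")),
]
```

Running fold_constants and then eliminate_dead_code with "d" as the only
live output leaves just the folded constant c and the multiply.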
>
>
> Gallium needs a), but not necessarily b).  An optimizing compiler needs b)
> internally but not necessarily a).
>
> An IR that achieves both a) and b) is not impossible, but it is a more
> difficult trade-off.
>
>
> My point is: it's OK to use a different IR in Gallium's interface, provided
> that the IR used in Gallium's interface doesn't lose any
> information.
>
> On the other hand, there is a lot of momentum behind LLVM "inspired" IRs,
> like SPIR.  So there would probably be a lot of synergy if LLVM became
> Gallium's "standard" IR.
>
>
> Jose
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>

