Interesting points Jose. It turns out LLVM IR is an IR that works well for both uses. Slicing it a bit differently, it helps to distinguish an IR's "language" from its "representation":

a) the *language* is good for communicating between different layers
b) the *representation* is good for doing code transforms on

Some IRs are just languages, while others have rich internal
representations that make it easy to operate on them. LLVM IR has both:
it has a language form for transport, and a representation form for
transforms.

Regarding synergy, note that Khronos has SPIR (an LLVM-based IR) and
announced at SIGGRAPH a binary language for GLSL.

Cheers,
JohnK

On Mon, Aug 18, 2014 at 9:47 AM, Jose Fonseca <jfons...@vmware.com> wrote:

> On 18/08/14 14:21, Marek Olšák wrote:
>
>> On Mon, Aug 18, 2014 at 2:44 PM, Roland Scheidegger <srol...@vmware.com>
>> wrote:
>>
>>> On 16.08.2014 02:12, Connor Abbott wrote:
>>>
>>>> I know what you might be thinking right now. "Wait, *another* IR? Don't
>>>> we already have like 5 of those, not counting all the driver-specific
>>>> ones? Isn't this stuff complicated enough already?" Well, there are some
>>>> pretty good reasons to start afresh (again...). In the years we've been
>>>> using GLSL IR, we've come to realize that, in fact, it's not what we
>>>> want *at all* to do optimizations on. Ian has done a talk at FOSDEM that
>>>> highlights some of the problems they've run into:
>>>>
>>>> https://urldefense.proofpoint.com/v1/url?u=https://video.fosdem.org/2014/H1301_Cornil/Saturday/Three_Years_Experience_with_a_Treelike_Shader_IR.webm&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=F4msKE2WxRzA%2BwN%2B25muztFm5TSPwE8HKJfWfR2NgfY%3D%0A&m=iXhCeAYmidPDc1lFo757Cc9V0PvWAN4n3X%2Fw%2B%2F7Lx%2Fs%3D%0A&s=f103fb26bf53eee64318a490517d1ee9ab88ecd29fcdbe49d54b5a27e7581c2e
>>>>
>>>> But here's the summary:
>>>>
>>>> * GLSL IR is way too much of a memory hog, since it has to make a new
>>>> variable for each temporary the compiler creates and then each time you
>>>> want to dereference that temporary you need to create an
>>>> ir_dereference_variable that points to it which is also very
>>>> cache-unfriendly ("downright cache-mean!").
>>>>
>>>> * The expression trees were originally added so that we could do
>>>> pattern matching to automatically optimize things, but this turned out
>>>> to be both very difficult to do and not very helpful. Instead, all it
>>>> does is add more complexity to the IR without much benefit - with SSA or
>>>> having proper use-def chains, we could get back what the trees give us
>>>> while also being able to do lots more optimizations.
>>>>
>>>> * We don't have the concept of basic blocks in GLSL IR, which makes a
>>>> lot of optimizations harder because they were originally designed with
>>>> basic blocks in mind - take, for example, my SSA series. I had to map a
>>>> whole lot of concepts that were based on the control flow graph to this
>>>> tree of statements that GLSL IR uses, and the end result wound up
>>>> looking nothing at all like the original paper. This problem gets even
>>>> worse for things like e.g. Global Code Motion that depend upon having
>>>> the dominance tree.
>>>>
>>>> I originally wanted to modify GLSL IR to fix these problems by adding
>>>> new instruction types that would address these issues and then
>>>> converting back and forth between the old and the new form, but I
>>>> realized that fixing all the problems would basically mean a complete
>>>> rewrite - and if that's the case, then why don't we start from scratch?
>>>> So I took Ken's suggestions and started designing, and then at Intel
>>>> over the summer started implementing, a completely new IR which I call
>>>> NIR that's at a lower level than GLSL IR, but still high-level enough to
>>>> be mostly device-independent (different drivers may have different
>>>> passes and different ways of lowering e.g. matrix multiplies) so that
>>>> we can do generic optimizations on it. Having support for SSA from the
>>>> beginning was also a must, because lots of optimisations that we really
>>>> want for cleaning up DX9-translated games are either a lot easier in or
>>>> made possible by SSA. I also made the decision for it to be typeless,
>>>> because that's what the cool kids are all doing :) and for a
>>>> lower-level, flat IR it seemed like the thing to do (it could have gone
>>>> either way, though). So the key design points of NIR (pronounced either
>>>> like "near" as in "NIR is near!" or to rhyme with "burr") are:
>>>>
>>>> * It's flat (no expression trees)
>>>>
>>>> * It's typeless
>>>>
>>>> * Modifiers (abs, negate, saturate), swizzles, and write masks are part
>>>> of ALU instructions
>>>>
>>>> * It includes enough GLSL-like things (variables that you can load from
>>>> or store to, function calls) to be hardware-agnostic and to allow
>>>> optimizations at a high level (we don't have a way to represent matrix
>>>> multiplies right now, but that could easily be added), while having
>>>> lowering passes that convert variables to registers and
>>>> input/output/uniform loads/stores, which opens up more opportunities
>>>> for optimization and saves memory at the cost of being more
>>>> hardware-specific.
>>>>
>>>> * Control flow consists of a tree of if statements and loops, like in
>>>> GLSL IR, except the leaves of the tree are now basic blocks instead of
>>>> instructions. Also, each basic block keeps track of its successors and
>>>> predecessors, so the control flow graph is explicit in the IR.
>>>>
>>>> * SSA is natively supported, and SSA uses point directly to the SSA
>>>> definition, which means that the use-def chains are always there, and
>>>> def-use chains are kept by tracking the set of all uses for each
>>>> definition.
>>>>
>>>> * It's written in C.
>>>>
>>>> (see the README in patch 3 and nir.h in patch 4 for more details)
>>>>
>>>> Some things that are missing or could be improved:
>>>>
>>>> * There's currently no alias tracking for inputs, outputs, and uniforms.
>>>> This is especially important for uniforms because we don't pack them
>>>> like we pack inputs and outputs.
>>>>
>>>> * We need a way to represent matrix multiplies so that we can do
>>>> matrix-flipping optimizations in NIR (currently GLSL IR does this for
>>>> us).
>>>>
>>>> * I'm not entirely happy about how we represent loads and stores in the
>>>> IR.
>>>> Right now, they're intrinsics, but that means we need a different
>>>> intrinsic for each size and combination of arguments (indirect vs. not
>>>> indirect, etc.) and we might run into a combinatorial explosion problem
>>>> in the future, so we might need to make separate load/store instructions
>>>> like what I did for textures.
>>>>
>>>> * Right now, we only have a pass that lowers variables for scalar
>>>> backends. We need to write a similar pass for vector backends that uses
>>>> std140 packing or something similar, as well as porting
>>>> lower_ubo_reference to NIR and changing it to output offsets in the
>>>> hardware-native units instead of bytes.
>>>>
>>>> * We'll need to write a pass that splits up vector expressions for
>>>> scalar backends.
>>>>
>>>
>> [...]
>>
>> However, let's face it, gallium is stuck with TGSI
>> forever. Switching to another IR in Gallium is insane (unless you can
>> rewrite all drivers and state trackers for it - let's be realistic, it
>> just won't happen). The next GL NG IR, whatever it is going to be,
>> will be just as important as the IR of ARB_vertex_program. TGSI will
>> continue to be the major IR whether we like it or not.
>>
>
> No, switching to another IR in Gallium is not insane if approached the
> right way. We already allow multiple IRs in gallium, so all it takes to
> move to another IR is to have helper modules to do the translation:
>
> - a pipe driver helper module that would translate the new IR into TGSI,
> for the sake of old pipe drivers
>
> - a state tracker helper module that would translate TGSI into the new IR,
> for the sake of old state trackers.
>
> Once these are in place, all development effort can go toward
> improving/leveraging the new IR. We could deprecate TGSI once it has
> few users.
>
> I also want to highlight that there are two kinds of "IR".
>
> a) one is a shader IR that communicates a shader across an interface
> (for example, the application interface):
>
>          High-level lang.             IR             GPU code
>   App -----------------> front-end ----> back-end ----------> GPU
>
> b) another is a shader IR that is meant to facilitate code transformations
> (i.e. optimizations):
>
>      opt. pass     opt. pass
>   IR ---------> IR ---------> IR --> ....
>
> Gallium needs a), but not necessarily b). An optimizing compiler needs b)
> internally but not necessarily a).
>
> An IR that achieves both a) and b) is not impossible, but it is a more
> difficult trade-off.
>
> My point is: it's OK to use a different IR in the Gallium interface,
> provided that the IR used in Gallium's interface doesn't lose any
> information.
>
> On the other hand, there is a lot of momentum behind LLVM "inspired" IRs,
> like SPIR. So there would probably be a lot of synergy if LLVM became
> Gallium's "standard" IR.
>
> Jose
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
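
To make the language/representation split a bit more concrete, here is a
minimal sketch in C (the language NIR is written in). It is only an
illustration under stated assumptions: the names instr, add_src and
print_instr, the fixed array sizes, and the "ssa_N = op ..." text format are
all hypothetical and are not taken from nir.h or from LLVM. The struct is the
"representation": each source points directly at its SSA definition (use-def),
and each definition records its users (def-use). print_instr produces the
"language": a flat line of text that could be handed across an interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_SRCS 2
#define MAX_USES 8

/* Hypothetical names and layout, for illustration only. */
typedef struct instr instr;

struct instr {
    int index;              /* SSA value number, used only when printing */
    const char *op;         /* opcode name, e.g. "load_const" or "fadd" */
    float imm;              /* immediate, printed when there are no sources */
    int num_srcs;
    instr *srcs[MAX_SRCS];  /* use-def: each source points at its definition */
    int num_uses;
    instr *uses[MAX_USES];  /* def-use: every instruction reading this value */
};

/* Representation side: wire def into user's next source slot and keep
 * both chains in sync. Returns -1 if either fixed-size array is full. */
static int add_src(instr *user, instr *def)
{
    if (user->num_srcs >= MAX_SRCS || def->num_uses >= MAX_USES)
        return -1;
    user->srcs[user->num_srcs++] = def;
    def->uses[def->num_uses++] = user;
    return 0;
}

/* Language side: flatten one instruction into a line of text.
 * Output is truncated if buf is too small. */
static void print_instr(const instr *in, char *buf, size_t size)
{
    size_t len;

    assert(size > 0);
    snprintf(buf, size, "ssa_%d = %s", in->index, in->op);
    if (in->num_srcs == 0) {
        len = strlen(buf);
        snprintf(buf + len, size - len, " %.1f", in->imm);
    }
    for (int i = 0; i < in->num_srcs; i++) {
        len = strlen(buf);
        snprintf(buf + len, size - len, " ssa_%d", in->srcs[i]->index);
    }
}
```

For example, with two load_const instructions numbered 0 and 1 and an fadd
numbered 2 whose sources are wired up with add_src, print_instr on the fadd
gives "ssa_2 = fadd ssa_0 ssa_1". An optimization pass would walk srcs[] and
uses[] directly; only the printed text has to stay stable across an interface.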