On Wed, May 28, 2014 at 8:50 PM, Eric Anholt <e...@anholt.net> wrote: > Connor Abbott <cwabbo...@gmail.com> writes: > >> On Wed, May 28, 2014 at 2:37 PM, Eric Anholt <e...@anholt.net> wrote: >>> Here's a series I started back in January as a little experiment. >>> Basically, I feel guilty for pushing GLSL IR into the driver, and wish I'd >>> just fixed up Mesa IR back in the day. But, given that we're still >>> feeding Mesa IR into drivers as well (ARB programs and fixed function >>> vertex programs), it made me think: What if I fixed it up now, and got >>> Mesa IR to the point that we could just garbage collect the GLSL IR input >>> paths to drivers? >>> >>> Mesa IR has a bunch of weaknesses that need to get sorted out if it's >>> going to be useful: >>> >>> - It's a single giant array of instructions, making modifications of the >>> instruction stream (instruction lowering, optimization, etc.) more >>> expensive in code and CPU time than it should be. >>> - It doesn't have any variable declarations, so if you have dynamic array >>> indexing, optimization just shuts down (plus, no annotation on the >>> temps, so debugging is irritating). >>> - It doesn't have integer instructions or anything else post-GLSL-1.30. >>> - The optimization passes for it are totally ad-hoc and fairly weak. >>> - It's not SSA. >>> >>> I'm interested in fixing all of these. How do people feel about this >>> goal? >> >> I don't know much about Mesa IR, but this seems like a huge pile of >> work... is this any easier than just making GLSL IR flatter, like with >> my flatland idea that I posted a while ago? Also, iirc, there was an >> effort a while ago to make ARB programs go through GLSL IR but it was >> abandoned - why was it so difficult? I'm skeptical that basically >> bringing an older IR from 1999 to 2014 is going to be less effort than >> modifying a newer one that had some good ideas and some bad ones. > > The thing is that I want basically all the ways that GLSL IR is > different from Mesa IR to go back to Mesa IR, except for size > declarations on temps. It was a mistake. From there I want new stuff, > but building new stuff doesn't feel that hard when it's incremental > changes.
Well, I think that in my "perfect IR" we'd still have some parts of GLSL IR. I'd have the if statements and loops in a tree structure (more on that below), plus I think that the ir_variable stuff is useful for a few things: 1. Global variables, including uniforms, ins and outs where we have to match names and types 2. Function parameters where we have to do the same thing 3. Arrays and structures Then we would have instructions for loading and storing ir_variables (or for arrays and structures, a single vector in the ir_variable), which later on would be lowered UBO-style. All instructions would be in SSA form, with each use pointing directly to the corresponding definition; we could introduce registers for backends that don't support SSA but the majority of the compiler would ignore that. The AST -> IR code would output SSA instructions where it would now output expressions, and then after structure splitting we do the conversion to SSA, which also acts as a pass that eliminates ir_variables. Now *this* is something that can be easily done incrementally, since the new stuff would very easily interoperate with the old stuff; it would be easy to write a pass that converts between the old & new form. > > i965 faces a non-incremental change if it was to choose to follow this > path: flipping everything over to brw_*_[vf]p.cpp. That doesn't look > pleasant, and it has some challenges (like its MOV-happy swizzle > handling in brw_fs_fp.cpp). But, as Ken pointed out, it would probably be a lot more work to bring the Mesa IR path up to par with the current GLSL IR path... > >> By the way, correct me if I'm wrong, but I think you're still missing >> a bullet point there. You've changed Mesa IR to be a list of >> instructions, but it's still just one list. Ideally, you'd have both >> the graph structure (the basic blocks) and the tree structure (if >> statements and loops) available in the IR, since both of them are >> going to be needed for different reasons, especially with converting >> to SSA (or doing basically any non-trivial optimization). As of now, >> GLSL IR has the former (I think Ian had some patches to introduce the >> latter through an analysis pass as part of an experiment with >> introducing def-use chains to GLSL IR...) but Mesa IR has neither - in >> the end, I think it's much more practical to have a tree of if >> statements and loops like in GLSL IR than a simple list of >> instructions. Again, I don't think it's less effort than just fixing >> GLSL IR. > > We absolutely need a CFG handy to do optimization well, but I disagree > on having the tree structure in the IR for flow-control constructs. I > recently wrote a CFG constructor for GLSL IR, and found that trying to > correctly walk the tree (particularly for dataflow analysis, but also to > an extent for just building a CFG) was way harder than it was in i965's > IR and I definitely don't trust the resulting code anything like I do > for the i965 stuff (you can find that work on the deadcode branch of my > tree). I don't see how it was that much harder. You guys basically already do the exact same thing you did (walking the tree to build a CFG) for i965 in brw_cfg.cpp; in fact, you can basically copy-n-paste the logic there to GLSL IR to do what your live variables code did. In the end, a flat list of instructions has no advantage over a tree because everything you can do with the list, you can automatically do with the tree in an equivalent manner - if you wanted to walk the list forwards or backwards, just visit the tree in a depth-first manner, something that the visitor classes let you do very easily already. With a tree, however, oftentimes you can write things in a more natural recursive way. Take, for example, your example with figuring out the CFG. In GLSL IR, you could do something like: visit(ir_if *) { struct basic_block *then_bb = ... then_bb->first_instr = first instr of then (unless it's a control-flow statemtent) link this->cur_bb to then_bb struct basic_block *else_bb = NULL; if (else isn't empty) { else_bb = ... else_bb->first_instr = first instr of else (unless it's a control-flow statement) link this->cur_bb to else_bb } struct basic_block *exit_bb = ... exit_bb->first_instr = first instr after if this->cur_bb = then_bb visit then statements... link this->cur_bb and exit_bb if (else isn't empty) { this->cur_bb = else_bb visit else statements... link this->cur_bb and exit_bb } this->cur_bb = exit_bb } visit(ir_loop *) { struct basic_block *old_entry_bb = this->entry_bb struct basic_block *old_exit_bb = this->exit_bb this->entry_bb = ... this->entry_bb->first_instr = first instr of loop link this->cur_bb to this->entry_bb this->exit_bb = ... this->exit_bb->first_instr = first instr after loop this->cur_bb = this->entry_bb visit loop statements... this->cur_bb = this->exit_bb this->entry_bb = old_entry_bb this->exit_bb = old_exit_bb } visit(ir_loop_jump *) { this->cur_bb->last_instr = this instr if (is break) link this->cur_bb and this->exit_bb else link this->cur_bb and this->entry_bb } visit(ir_assignment or ir_call) { this->cur_bb->last_instr = this instr } This is pseudocode, but it's pretty close to what I would actually write. It does what the big switch in brw_cfg.cpp does in about half the code (150 lines vs. ~70-80 lines), plus it doesn't require the makeshift stacks to store basic blocks. Also, it's probably more efficient since it uses the system stack which doesn't have to call malloc() every time something gets pushed on it. And beauty is in the eye of the beholder, but I think that I can trust this code more than the i965 code (after it's been tested, of course), since rather than hiding the recursive nature of what you're doing, it puts it front and center. Another way to think about it is that even with a flat list of instructions, you always have a tree structure - the IF, ELSE, ENDIF and WHILE and DO match up to form a tree - but every time you want to recurse through the tree, you wind up having to manage state like the entry and exit blocks of the loop by hand instead of simply letting the compiler do it, sort of like rolling your own state machine vs. using coroutines. So thanks for helping me with an example to prove my point :) _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev