On Fri, Feb 07, 2014 at 10:49:01AM -0500, Alex Deucher wrote:
> On Fri, Feb 7, 2014 at 12:34 AM, Connor Abbott <cwabbo...@gmail.com> wrote:
> > Hi,
> >
> > So I believe that we can all agree that the tree-based representation
> > that GLSL IR currently uses for shaders needs to go. For the benefit
> > of those that didn't watch Ian Romanick's talk at FOSDEM, I'll
> > reiterate some of the problems with it as of now:
> >
> > - All the ir_dereference chains blow up the memory usage, and the
> > constant pointer chasing in the recursive algorithms needed to handle
> > them is not just cache-unfriendly but "cache-mean."
> >
> > - The ir_hierarchical_visitor pattern that we currently use for
> > optimization/analysis passes has to examine every piece of IR, even
> > the irrelevant stuff, making the above problems even worse.
> >
> > - Nobody else does it this way, meaning that the existing well-known
> > optimizations don't apply as much here, and oftentimes we have to
> > write some pretty nasty code in order to make necessary optimizations
> > (like tree grafting).
> >
> > - It turns out that the original advantage of a tree-based IR, to be
> > able to automatically generate pattern-matching code for optimizing
> > certain code patterns, only really matters for CPUs with weird
> > instruction sets with lots of exotic instructions; GPUs tend to be
> > pretty regular and consistent in their ISAs, so being able to
> > pattern-match with trees doesn't help us much here.
> >
> > Finally, it seems like a lot of important SSA-based passes assume that
> > we have a flat IR, and so moving to SSA won't be nearly as beneficial
> > as we would like it to be; we could flatten the IR before doing these
> > passes, but that would make the first problem even worse. So we can't
> > really take advantage of SSA too much either until we have a flat IR.
> >
> > The real issue is, how do we let this transition occur gradually, in
> > pieces, without breaking existing code?
> > Ian proposed one solution at FOSDEM, but here's my idea of another.
> >
> > So, my idea is that rather than slowly introducing changes across the
> > board, we create the IR in its final form in the beginning, write
> > passes to flatten and unflatten the IR, and then piece-by-piece
> > rewrite the rest of the compiler. We're going to have to rewrite a lot
> > of the passes to support SSA in the first place, so why not convert
> > them to a flat IR while we're at it? The benefit of this is that it's
> > much easier to do asynchronously and in parallel; rather than
> > introducing changes to the entire thing at once, several people can
> > convert this and that pass, the frontend, the linker, etc.
> > independently. It would entail some extra overhead during the
> > transition in the form of the flattening and unflattening passes, but
> > I think it would be worth it for the immediate benefits (optimizations
> > like GVN-GCM and CSE made possible, etc.).
> >
> > The first part to be converted would be my passes to convert to and
> > from SSA, so that the compiler optimization part would look like this:
> >
> > flatten -> convert to SSA -> (the new hotness) -> out of SSA ->
> > unflatten -> (the old stuff)
> >
> > Then we gradually convert ast_to_hir, various passes, the linker,
> > backends, etc. to this form while now actually having the
> > infrastructure to implement any advanced compiler optimization
> > designed in the last ~15 years or so by more-or-less copying down the
> > pseudocode. Hopefully, then, we can reach a point where we can rip out
> > the old IR and the converters.
> >
> > So what would this new IR look like? Well, here's my 2 cents (in the
> > form of some abridged class definitions, you should get the point...)
> >
> > struct ir_calc_source
> > {
> >     mode; /**< SSA or non-SSA */
> >     union {
> >         ir_calculation *def; /**< for SSA sources */
> >         unsigned int reg; /**< for non-SSA sources */
> >     } src;
> >     unsigned swizzle : 8;
> > };
> >
> > struct ir_calc_dest
> > {
> >     mode; /**< SSA or non-SSA */
> >     union {
> >         unsigned int reg; /**< for non-SSA destinations */
> >
> >         /**
> >          * For SSA destinations. Types are needed here because normally
> >          * they're part of the register, but SSA doesn't have registers.
> >          */
> >         glsl_type *type;
> >     } reg_or_type; /* this name is kinda ugly but couldn't think of
> >                       anything better. */
> > };
> >
> > /*
> >  * This is Ian's name for it, personally I would vote for
> >  * s/ir_instruction/ir_node/ and call this ir_instruction
> >  */
> >
> > class ir_calculation
> > {
> >     ir_calc_dest dest;
> >     ir_expression_operation op;
> >     unsigned write_mask : 4;
> >     ir_calc_source srcs[4];
> > };
> >
> > class ir_load_var
> > {
> >     ir_calc_dest dest;
> >     ir_variable *src;
> >
> >     /**
> >      * For array and record loads, whether we're loading a specific
> >      * member or the whole thing.
> >      */
> >     bool deref_member;
> >     ir_calc_source array_index; /**< for array loads if deref_member
> >                                      is true */
> >     char *record_index; /**< for structure loads */
> > };
> >
> > class ir_store_var
> > {
> >     ir_variable *dest;
> >     ir_calc_source src;
> >     bool deref_member;
> >     ir_calc_source array_index; /**< for array stores */
> >     char *record_index; /**< for structure stores */
> >     unsigned write_mask : 4;
> > };
> >
> > So ir_variable still exists, but it will only be used for function
> > parameters, shader in/outs and uniforms, and arrays and structures.
> > Registers will be much more lightweight, only requiring a table with
> > each register's type and perhaps uses and definitions.
> > The flattening pass, and later ast_to_hir, will emit loads and stores
> > wherever there is an ir_dereference now, but there will be an
> > ir_variable -> register pass that converts these to moves that will
> > later be eliminated by copy propagation (in SSA form, after converting
> > the registers to SSA writes). This is similar to how LLVM works, with
> > everything starting out allocated on the stack using alloca
> > (equivalent to ir_variables here) and accessed explicitly using loads
> > and stores, but then some of these loads/stores are optimized out.
> >
>
> What about just moving to llvm directly?  We already use it for
> compute/OpenCL on gallium and as the shader compiler for radeon
> hardware and llvmpipe.
>
Vincent Lejeune wrote a GLSL IR to LLVM IR state tracker a while back:
http://cgit.freedesktop.org/~vlj/mesa/log/?h=glsl-to-llvm-05nov

It might be worth looking at if anyone is considering using LLVM.

-Tom
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
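
For anyone skimming the thread, here is a minimal sketch of what the
"ir_variable -> register" pass from Connor's proposal might look like on a
flat instruction list. It is not taken from any patch: Op, Instr,
lower_vars_to_regs and demo_program are all hypothetical names, registers
are plain ints, and SSA, swizzles, write masks and array/record
dereferences are left out entirely. It only illustrates the idea that each
load or store of a variable becomes a register move, which a later copy
propagation pass (not shown) would be expected to clean up.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-ins for the quoted classes.
// Registers are plain ints; -1 means "unused".
enum class Op { LoadVar, StoreVar, Mov, Add };

struct Instr {
    Op op;
    int dest;         // destination register (unused for StoreVar)
    int src0;         // first source register
    int src1;         // second source register
    std::string var;  // variable name, only for LoadVar/StoreVar
};

// Assumed shape of the "ir_variable -> register" pass: give every
// variable its own register and turn each load/store into a move.
std::vector<Instr> lower_vars_to_regs(const std::vector<Instr> &in,
                                      int &next_reg)
{
    std::map<std::string, int> var_reg;
    auto reg_for = [&](const std::string &name) {
        auto it = var_reg.find(name);
        if (it == var_reg.end())
            it = var_reg.emplace(name, next_reg++).first;
        return it->second;
    };

    std::vector<Instr> out;
    for (const Instr &i : in) {
        if (i.op == Op::LoadVar) {
            // dest = load var   becomes   dest = mov reg(var)
            out.push_back(Instr{Op::Mov, i.dest, reg_for(i.var), -1, ""});
        } else if (i.op == Op::StoreVar) {
            // store var, src    becomes   reg(var) = mov src
            out.push_back(Instr{Op::Mov, reg_for(i.var), i.src0, -1, ""});
        } else {
            out.push_back(i);  // everything else passes through unchanged
        }
    }
    return out;
}

// x = a + b, written with explicit loads and stores.
std::vector<Instr> demo_program()
{
    return {
        Instr{Op::LoadVar, 0, -1, -1, "a"},
        Instr{Op::LoadVar, 1, -1, -1, "b"},
        Instr{Op::Add, 2, 0, 1, ""},
        Instr{Op::StoreVar, -1, 2, -1, "x"},
    };
}
```

Calling lower_vars_to_regs(demo_program(), next_reg) with next_reg = 3
(r0..r2 are already used by the demo program) assigns a, b and x the
registers r3, r4 and r5, so the result is: mov r0, r3; mov r1, r4;
add r2, r0, r1; mov r5, r2. All three moves are the kind of copies the
proposal expects copy propagation to remove once the registers are in
SSA form.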