On 02/07/2014 07:49 AM, Alex Deucher wrote:
> On Fri, Feb 7, 2014 at 12:34 AM, Connor Abbott <cwabbo...@gmail.com> wrote:
>> Hi,
>>
>> So I believe that we can all agree that the tree-based representation
>> that GLSL IR currently uses for shaders needs to go. For the benefit
>> of those that didn't watch Ian Romanick's talk at FOSDEM, I'll
>> reiterate some of the problems with it as of now:
>>
>> - All the ir_dereference chains blow up the memory usage, and the
>> constant pointer chasing in the recursive algorithms needed to handle
>> them is not just cache-unfriendly but "cache-mean."
>>
>> - The ir_hierarchical_visitor pattern that we currently use for
>> optimization/analysis passes has to examine every piece of IR, even
>> the irrelevant stuff, making the above problems even worse.
>>
>> - Nobody else does it this way, meaning that the existing well-known
>> optimizations don't apply as much here, and oftentimes we have to
>> write some pretty nasty code in order to make necessary optimizations
>> (like tree grafting).
>>
>> - It turns out that the original advantage of a tree-based IR, to be
>> able to automatically generate pattern-matching code for optimizing
>> certain code patterns, only really matters for CPUs with weird
>> instruction sets with lots of exotic instructions; GPUs tend to be
>> pretty regular and consistent in their ISAs, so being able to
>> pattern-match with trees doesn't help us much here.
>>
>> Finally, it seems like a lot of important SSA-based passes assume that
>> we have a flat IR, and so moving to SSA won't be nearly as beneficial
>> as we would like it to be; we could flatten the IR before doing these
>> passes, but that would make the first problem even worse. So we can't
>> really take advantage of SSA too much either until we have a flat IR.
>>
>> The real issue is, how do we let this transition occur gradually, in
>> pieces, without breaking existing code? Ian proposed one solution at
>> FOSDEM, but here's my idea of another.
>>
>> So, my idea is that rather than slowly introducing changes across the
>> board, we create the IR in its final form in the beginning, write
>> passes to flatten and unflatten the IR, and then piece-by-piece
>> rewrite the rest of the compiler. We're going to have to rewrite a lot
>> of the passes to support SSA in the first place, so why not convert
>> them to a flat IR while we're at it? The benefit of this is that it's
>> much easier to do asynchronously and in parallel; rather than
>> introducing changes to the entire thing at once, several people can
>> convert this and that pass, the frontend, the linker, etc.
>> independently. It would entail some extra overhead during the
>> transition in the form of the flattening and unflattening passes, but
>> I think it would be worth it for the immediate benefits (optimizations
>> like GVN-GCM and CSE made possible, etc.).
>>
>> The first part to be converted would be my passes to convert to and
>> from SSA, so that the compiler optimization part would look like this:
>>
>> flatten -> convert to SSA -> (the new hotness) -> out of SSA ->
>> unflatten -> (the old stuff)
>>
>> Then we gradually convert ast_to_hir, various passes, the linker,
>> backends, etc. to this form while now actually having the
>> infrastructure to implement any advanced compiler optimization
>> designed in the last ~15 years or so by more-or-less copying down the
>> pseudocode. Hopefully, then, we can reach a point where we can rip out
>> the old IR and the converters.
>>
>> So what would this new IR look like? Well, here's my 2 cents (in the
>> form of some abridged class definitions, you should get the point...)
>>
>> struct ir_calc_source
>> {
>>     mode; /**< SSA or non-SSA */
>>     union {
>>         ir_calculation *def; /**< for SSA sources */
>>         unsigned int reg; /**< for non-SSA sources */
>>     } src;
>>     unsigned swizzle : 8;
>> };
>>
>> struct ir_calc_dest
>> {
>>     mode; /**< SSA or non-SSA */
>>     union {
>>         unsigned int reg; /**< for non-SSA destinations */
>>
>>         /**
>>          * For SSA destinations. Types are needed here because normally
>>          * they're part of the register, but SSA doesn't have registers.
>>          */
>>         glsl_type *type;
>>     } reg_or_type; /* this name is kinda ugly but couldn't think of
>>                       anything better. */
>> };
>>
>> /*
>>  * This is Ian's name for it, personally I would vote for
>>  * s/ir_instruction/ir_node/ and call this ir_instruction
>>  */
>>
>> class ir_calculation
>> {
>>     ir_calc_dest dest;
>>     ir_expression_operation op;
>>     unsigned write_mask : 4;
>>     ir_calc_source srcs[4];
>> };
>>
>> class ir_load_var
>> {
>>     ir_calc_dest dest;
>>     ir_variable *src;
>>
>>     /**
>>      * For array and record loads, whether we're loading a specific
>>      * member or the whole thing.
>>      */
>>     bool deref_member;
>>     ir_calc_source array_index; /**< for array loads if deref_member
>>                                      is true */
>>     char *record_index; /**< for structure loads */
>> };
>>
>> class ir_store_var
>> {
>>     ir_variable *dest;
>>     ir_calc_source src;
>>     bool deref_member;
>>     ir_calc_source array_index; /**< for array stores */
>>     char *record_index; /**< for structure stores */
>>     unsigned write_mask : 4;
>> };
>>
>> So ir_variable still exists, but it will only be used for function
>> parameters, shader in/outs and uniforms, and arrays and structures.
>> Registers will be much more lightweight, only requiring a table with
>> each register's type and perhaps uses and definitions. The flattening
>> pass, and later ast_to_hir, will emit loads and stores wherever there
>> is an ir_dereference now, but there will be an ir_variable -> register
>> pass that converts these to moves that will later be eliminated by
>> copy propagation (in SSA form, after converting the registers to SSA
>> writes). This is similar to how LLVM works, with everything starting
>> out allocated on the stack using alloca (equivalent to ir_variables
>> here) and accessed explicitly using loads and stores, but then some of
>> these loads/stores are optimized out.
>>
>
> What about just moving to LLVM directly? We already use it for
> compute/OpenCL on gallium and as the shader compiler for radeon
> hardware and llvmpipe.

We're absolutely not interested in anything involving LLVM. We've
tried, several times, to make it work, and it has always ended with a
table-flip-rage-quit.

> Alex
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
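
For readers following the proposal quoted above, here is a minimal
sketch of the "loads/stores -> moves -> copy propagation" idea on a toy
flat IR. Everything in it is an assumption made for illustration: the
names Instr, Op, kVarRegBase, lower_vars, copy_propagate and
eliminate_dead are hypothetical and do not come from Mesa, the code only
handles a single straight-line block (no control flow), and it does
local copy propagation on plain registers rather than the SSA-based
version the proposal describes. Copy propagation only rewrites uses, so
a separate dead-code pass removes the moves it leaves unused.

```cpp
#include <cassert>
#include <initializer_list>
#include <set>
#include <unordered_map>
#include <vector>

// Hypothetical toy flat IR; none of these names are Mesa's.
enum class Op { LoadVar, StoreVar, Mov, Add };

struct Instr {
    Op op;
    int dest;  // destination register (variable id for StoreVar)
    int src0;  // first source register (variable id for LoadVar)
    int src1;  // second source register (Add only), otherwise -1
};

// Assumption: variable N is backed by register kVarRegBase + N.
const int kVarRegBase = 100;

// Variable -> register pass: every load/store becomes a plain move.
std::vector<Instr> lower_vars(const std::vector<Instr> &in) {
    std::vector<Instr> out;
    for (const Instr &i : in) {
        if (i.op == Op::LoadVar)
            out.push_back({Op::Mov, i.dest, kVarRegBase + i.src0, -1});
        else if (i.op == Op::StoreVar)
            out.push_back({Op::Mov, kVarRegBase + i.dest, i.src0, -1});
        else
            out.push_back(i);
    }
    return out;
}

// Local copy propagation over one straight-line block.
void copy_propagate(std::vector<Instr> &code) {
    std::unordered_map<int, int> copy_of;  // register -> register it copies
    for (Instr &i : code) {
        // Replace each source by the register it is known to be a copy of.
        for (int *src : {&i.src0, &i.src1}) {
            auto it = copy_of.find(*src);
            if (it != copy_of.end()) *src = it->second;
        }
        // Writing i.dest invalidates every recorded copy that mentions it.
        for (auto it = copy_of.begin(); it != copy_of.end();) {
            if (it->first == i.dest || it->second == i.dest)
                it = copy_of.erase(it);
            else
                ++it;
        }
        if (i.op == Op::Mov && i.dest != i.src0) copy_of[i.dest] = i.src0;
    }
}

// Backward dead-code elimination; live holds the registers that must
// survive the block (for example, those backing shader outputs).
std::vector<Instr> eliminate_dead(const std::vector<Instr> &code,
                                  std::set<int> live) {
    std::vector<Instr> kept;
    for (auto it = code.rbegin(); it != code.rend(); ++it) {
        // Only valid after lower_vars: no loads or stores may remain.
        assert(it->op == Op::Mov || it->op == Op::Add);
        if (!live.count(it->dest)) continue;  // result never used: drop it
        live.erase(it->dest);
        live.insert(it->src0);
        if (it->src1 >= 0) live.insert(it->src1);
        kept.insert(kept.begin(), *it);
    }
    return kept;
}
```

Running lower_vars, then copy_propagate, then eliminate_dead on a block
that loads the same variable twice, adds the two values and stores the
sum leaves just the add (reading the variable's register directly) and
the move into the output variable's register.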