On Wed, Jul 16, 2014 at 4:14 PM, Thomas Helland <thomashellan...@gmail.com> wrote:
> 2014-07-13 20:13 GMT+02:00 Matt Turner <matts...@gmail.com>:
>>
>> On Sun, Jul 13, 2014 at 10:50 AM, Thomas Helland
>> <thomashellan...@gmail.com> wrote:
>> > I've considered writing an algebraic optimization to convert
>> > this into an ir_binop_pow. If my understanding is correct the backend
>> > will then implement this in a similar fashion as above if it does not
>> > have a native pow() instruction.
>> >
>> > If, on the other hand, we have a pow() instruction, my guess is
>> > we'd see reduced instruction-counts.
>> >
>> > Is my understanding correct? Is this something that's worth doing?
>>
>> Yes and yes :)
>>
>> It's something I've thought about doing for a while. The only hang-up
>> is that we don't get nice expression trees to match in opt_algebraic.
>> Ideally, we'd get an ir_instruction with an rvalue that looked like
>>
>> (assign (xyz) (var_ref r3) (expression vec3 exp2 (expression vec3 *
>> (expression vec3 log2 (swiz xyz (var_ref r3))) (constant vec3
>> (2.200000 2.200000 2.200000)))))
>>
>> and then the bit of code in opt_algebraic is simple. Unfortunately, r3
>> is likely a vec4 and is used repeatedly throughout the shader for many
>> unrelated things. If we were able to split up these variables (i.e.,
>> recognize that the use of r3 for log2/mul/exp2 is a distinct live
>> range from the other uses of r3, and give it a new variable name) then
>> tree grafting would be able to give us the expression tree that we
>> want.
>>
>
> So we would probably be helped with a UD-chain, and a pass to
> make new variables for each of the new definitions?
> As far as I've managed to acclimate to the code-base we
> do not have such a feature yet in the glsl-compiler?
Right. UD chains would probably help a lot in solving this problem.

>> That would let a lot of existing optimization passes perform better as well.
>>
>> Ken and I worked on this kind of pass in the i965 backend [0]. It
>> looked for full register writes outside of control flow, assigned the
>> result to a new register, and rewrote future uses of the old with the
>> new register. Something like that at the GLSL IR level would do the
>> trick. One problem to solve is how to handle partial writes of
>> variables, since in the case you brought up the shader only uses 3
>> components of a vec4, but they're still a distinct live range.
>>
>
> I guess we would need to keep track of the uses and defs for
> each component in the vector, some kind of fancy UD-chain
> that works component-wise, and also globally on the vector.
>
> I accidentally stumbled across some work in Eric's git-repo that
> looks pretty useful as a basis for how to go about this. [1]
> It seems to implement live-variable analysis that is both
> control-flow and swizzle-aware, and works component-wise.
> I have only given it a short glimpse, but it seems promising.

I hadn't considered using that code, but yeah, that would probably be
really helpful.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
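
For illustration only, here is a rough sketch of the renaming idea described above: find full register writes outside of control flow, give the result a fresh register, and rewrite later uses of the old name. It is not the i965 pass from [0] and it does not use Mesa's IR classes. The `Inst` struct, its fields, and `split_live_ranges` are hypothetical names made up for this sketch. Partial writes and writes inside control flow simply keep the current name, so the component-wise problem raised in the thread is left unsolved here.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified instruction. This is NOT Mesa's ir_instruction
// nor the i965 backend's instruction type; it only models what the sketch needs.
struct Inst {
    std::string op;          // opcode name, informational only
    int dst;                 // destination register number
    std::vector<int> srcs;   // source register numbers
    bool full_write;         // true if every component of dst is written
    bool in_control_flow;    // true if the instruction sits inside an if/loop
};

// Renames registers in place so that each full write outside control flow
// starts a new live range. Returns the register count after renaming.
int split_live_ranges(std::vector<Inst> &prog, int num_regs) {
    const int first_fresh = num_regs;  // fresh names start above the originals
    std::map<int, int> current;        // original register -> name holding its value now

    auto lookup = [&](int r) {
        auto it = current.find(r);
        return it == current.end() ? r : it->second;
    };

    for (Inst &inst : prog) {
        // The input program must only mention original register numbers.
        assert(inst.dst >= 0 && inst.dst < first_fresh);

        // Uses are rewritten first, so "r3 = log2 r3" reads the old name.
        for (int &s : inst.srcs)
            s = lookup(s);

        if (inst.full_write && !inst.in_control_flow) {
            // Nothing of the old value survives: start a new live range.
            int fresh = num_regs++;
            current[inst.dst] = fresh;
            inst.dst = fresh;
        } else {
            // Partial or conditional write: the old value may still be
            // visible afterwards, so keep writing to the current name.
            inst.dst = lookup(inst.dst);
        }
    }
    return num_regs;
}
```

Run on a log2/mul/exp2 chain that keeps reusing r3, each step would end up writing its own register and reading the previous step's, which is the shape tree grafting needs to rebuild a single expression. A component-aware version would have to track the current name per channel rather than per register.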