I had a pretty similar patch on top of my pow-optimization branch. I also expand x**3 and x**4. I had hoped that would let some cases expand and then merge into MADs. It should also be faster on older GENs where POW perf sucks. I didn't send it out because I wanted to add a similar optimization in the back end that would turn x*x*x*x back into x**4 on GPUs where the POW would be faster.
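For illustration only, here is a minimal standalone sketch of the flattening idea in plain C++ rather than Mesa IR. The helper name `expand_small_pow` and its out-parameter interface are hypothetical and do not come from either patch. It only shows how x**2, x**3, and x**4 reduce to one or two multiplies when the squared value is kept in a temporary, while larger exponents are left alone.

```cpp
// Hypothetical helper, not part of Mesa: evaluates x**n for the small
// constant exponents discussed above using only multiplies.
// Returns false when n is outside the range this sketch flattens,
// meaning the caller would keep the original POW.
static bool expand_small_pow(float x, int n, float *result)
{
   switch (n) {
   case 2:
      *result = x * x;            // 1 MUL
      return true;
   case 3: {
      const float x2 = x * x;     // temporary holding x*x
      *result = x2 * x;           // 2 MULs total
      return true;
   }
   case 4: {
      const float x2 = x * x;     // temporary reused for both operands
      *result = x2 * x2;          // 2 MULs total
      return true;
   }
   default:
      return false;               // e.g. x**5: not flattened here
   }
}
```

Keeping x*x in a temporary is what makes x**4 cost two multiplies instead of three. Whether that beats a single POW depends on the hardware, which is the reason for wanting the reverse transformation in the back end.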
I also didn't have anything in shader-db that benefitted from x**2 or x**3. It seems like there were a couple that would be modified by an x**5 flattening, but I think that would universally be slower.

On 03/10/2014 03:54 PM, Matt Turner wrote:
> Cuts two instructions out of SynMark's Gl32VSInstancing benchmark.
> ---
>  src/glsl/opt_algebraic.cpp | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
> index 5c49a78..8494bd9 100644
> --- a/src/glsl/opt_algebraic.cpp
> +++ b/src/glsl/opt_algebraic.cpp
> @@ -528,6 +528,14 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
>        if (is_vec_two(op_const[0]))
>           return expr(ir_unop_exp2, ir->operands[1]);
>
> +      if (is_vec_two(op_const[1])) {
> +         ir_variable *x = new(ir) ir_variable(ir->operands[1]->type, "x",
> +                                              ir_var_temporary);
> +         base_ir->insert_before(x);
> +         base_ir->insert_before(assign(x, ir->operands[0]));
> +         return mul(x, x);
> +      }
> +
>        break;
>
>     case ir_unop_rcp: