On Tue, Nov 11, 2014 at 9:48 AM, Kenneth Graunke <kenn...@whitecape.org> wrote: > On Friday, October 24, 2014 03:09:45 PM Matt Turner wrote: >> Three source instructions cannot directly source a packed vec4 (<0,4,1> >> regioning) like vec4 uniforms, so we emit a MOV that expands the vec4 to >> both halves of a register. >> >> If these uniform values are used by multiple three-source instructions, >> we'll emit multiple expansion moves, which we cannot combine in CSE >> (because CSE emits moves itself). >> >> So emit a virtual instruction that we can CSE. >> >> Sometimes we demote a uniform to to a pull constant after emitting an >> expansion move for it. In that case, recognize in opt_algebraic that if >> the .file of the new instruction is GRF then it's just a real move that >> we can copy propagate and such. >> >> total instructions in shared programs: 5509805 -> 5499940 (-0.18%) >> instructions in affected programs: 348923 -> 339058 (-2.83%) >> --- >> src/mesa/drivers/dri/i965/brw_defines.h | 1 + >> src/mesa/drivers/dri/i965/brw_shader.cpp | 2 ++ >> src/mesa/drivers/dri/i965/brw_vec4.cpp | 7 +++++++ >> src/mesa/drivers/dri/i965/brw_vec4_cse.cpp | 1 + >> src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 1 + >> src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 2 +- >> 6 files changed, 13 insertions(+), 1 deletion(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h > b/src/mesa/drivers/dri/i965/brw_defines.h >> index a68d607..a6807ae 100644 >> --- a/src/mesa/drivers/dri/i965/brw_defines.h >> +++ b/src/mesa/drivers/dri/i965/brw_defines.h >> @@ -904,6 +904,7 @@ enum opcode { >> SHADER_OPCODE_GEN7_SCRATCH_READ, >> >> VEC4_OPCODE_PACK_BYTES, >> + VEC4_OPCODE_UNPACK_UNIFORM, >> >> FS_OPCODE_DDX, >> FS_OPCODE_DDY, >> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp > b/src/mesa/drivers/dri/i965/brw_shader.cpp >> index 87f2681..ea73dd7 100644 >> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp >> @@ -457,6 +457,8 @@ brw_instruction_name(enum opcode op) >> >> case VEC4_OPCODE_PACK_BYTES: >> return "pack_bytes"; >> + case VEC$_OPCODE_UNPACK_UNIFORM: > > MONEY! ^^^^^^^ > > I'm really not crazy about this patch...it's just another MOV operation, which > can be CSE'd, but isn't considered everywhere else we handle MOV. Such as > copy propagation, for example. > > I suppose this is OK, because you're only using it for "the following > instruction can't handle <whatever>" MOVs, where you're not likely to be able > to do things like copy propagation anyway.
That's exactly right. The only place it's emitted is from is fix_3src_operand, which we know can't take these arguments directly so copy propagation can't help us. opt_algebraic changes the opcode to MOV if the source isn't a uniform, like for an immediate. From there we can constant propagate it or whatever. > Speaking of, you could probably use it for other workarounds too: > - resolve_ud_negate > - fix_math_operand > - emit_math1_gen6 > - emit_math2_gen6 > > But I don't know if that would be helpful. > > Although I'm not a fan, I don't have anything better to suggest, and your > statistics clearly show a benefit. It's probably fine. I really don't have any idea why. Short of adding a bunch of code to CSE to rewrite operands rather than emitting another MOV (I don't know how feasible that is, given the complexity of register coalescing) I really don't know of a more elegant solution. > Reviewed-by: Kenneth Graunke <kenn...@whitecape.org> Thanks! _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev