On 19 September 2012 13:27, Kenneth Graunke <kenn...@whitecape.org> wrote:
> Data port reads are absurdly slow on Ivybridge due to cache issues.
>
> The LD message ignores the sampler unit index and SAMPLER_STATE pointer,
> instead relying on hard-wired default state.  Thus, there's no need to
> worry about running out of sampler units or providing SAMPLER_STATE;
> this small patch should be all that's required.
>
> NOTE: This is a candidate for all release branches.
>

Given that this affects only performance and not correctness, I'm having
trouble convincing myself that this patch should be a candidate for
release branches.  Don't we usually try to restrict release cherry-picks
to things like rendering issues and avoiding GPU hangs?

>
> Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
> ---
>  src/mesa/drivers/dri/i965/brw_fs.h        |  3 +++
>  src/mesa/drivers/dri/i965/brw_fs_emit.cpp | 36 ++++++++++++++++++++++++++++++-
>  2 files changed, 38 insertions(+), 1 deletion(-)
>
> I did this a long time ago for VS pull constant loading, which resulted in
> a 2-5x speedup for certain benchmarks.  Apparently at the time I never got
> FS pull constant loading working, and didn't have a benchmark that needed
> it, so I never finished and pushed it.
>
> Now I have a game that needs it.  No concrete data as I haven't figured out
> how to get consistent FPS numbers out of it.
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
> index e69de31..b5f2152 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -295,6 +295,9 @@ public:
>     void generate_pull_constant_load(fs_inst *inst, struct brw_reg dst,
>                                      struct brw_reg index,
>                                      struct brw_reg offset);
> +   void gen7_generate_pull_constant_load(fs_inst *inst, struct brw_reg dst,
> +                                         struct brw_reg index,
> +                                         struct brw_reg offset);
>     void generate_mov_dispatch_to_flags();
>
>     void emit_dummy_fs();
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
> index 5900c0e..4059660 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_emit.cpp
> @@ -585,6 +585,37 @@ fs_visitor::generate_unspill(fs_inst *inst, struct brw_reg dst)
>  }
>
>  void
> +fs_visitor::gen7_generate_pull_constant_load(fs_inst *inst, struct brw_reg dst,
> +                                             struct brw_reg index,
> +                                             struct brw_reg offset)
> +{
> +   assert(intel->gen == 7);
> +   assert(index.file == BRW_IMMEDIATE_VALUE &&
> +          index.type == BRW_REGISTER_TYPE_UD);
> +   assert(offset.file == BRW_IMMEDIATE_VALUE &&
> +          offset.type == BRW_REGISTER_TYPE_UD);
> +   uint32_t surf_index = index.dw1.ud;
> +   uint32_t read_offset = offset.dw1.ud;
> +
> +   /* offset is an IMM; SEND needs to be from a GRF. */
> +   offset = retype(brw_vec8_grf(127, 0), BRW_REGISTER_TYPE_UD);
> +   brw_MOV(p, offset, brw_imm_ud(read_offset / 16));
> +
> +   brw_instruction *insn = brw_next_insn(p, BRW_OPCODE_SEND);
> +   brw_set_dest(p, insn, dst);
> +   brw_set_src0(p, insn, offset);
> +   brw_set_sampler_message(p, insn,
> +                           surf_index,
> +                           0, /* LD message ignores sampler unit */
> +                           GEN5_SAMPLER_MESSAGE_SAMPLE_LD,
> +                           1, /* rlen */
> +                           1, /* mlen */
> +                           false, /* no header */
> +                           BRW_SAMPLER_SIMD_MODE_SIMD4X2,
> +                           0);
> +}
> +
> +void
>  fs_visitor::generate_pull_constant_load(fs_inst *inst, struct brw_reg dst,
>                                          struct brw_reg index,
>                                          struct brw_reg offset)
> @@ -980,7 +1011,10 @@ fs_visitor::generate_code()
>           break;
>
>        case FS_OPCODE_PULL_CONSTANT_LOAD:
> -         generate_pull_constant_load(inst, dst, src[0], src[1]);
> +         if (intel->gen == 7)
> +            gen7_generate_pull_constant_load(inst, dst, src[0], src[1]);
> +         else
> +            generate_pull_constant_load(inst, dst, src[0], src[1]);
>           break;
>
>        case FS_OPCODE_FB_WRITE:
> --
> 1.7.11.4
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
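
A side note on the `read_offset / 16` in the quoted patch: as far as I can
tell, it turns the byte offset into the index of a 16-byte vec4 element,
which is the unit the LD message appears to address here. The standalone
sketch below only illustrates that arithmetic. The names `kVec4Bytes` and
`vec4_index_from_byte_offset` are hypothetical and do not exist in Mesa, and
the 16-byte element size is my reading of the patch, not something it states.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: one element is a vec4 of four 32-bit components (16 bytes).
constexpr uint32_t kVec4Bytes = 16;

// Hypothetical helper mirroring the patch's `read_offset / 16`.
// Integer division truncates, so any byte offset inside a vec4
// maps to the index of the vec4 that contains it.
constexpr uint32_t vec4_index_from_byte_offset(uint32_t byte_offset)
{
   return byte_offset / kVec4Bytes;
}

// Compile-time checks of the arithmetic.
static_assert(vec4_index_from_byte_offset(0) == 0, "first vec4");
static_assert(vec4_index_from_byte_offset(48) == 3, "fourth vec4");

// Runtime usage example: offsets 16..31 all land in vec4 index 1.
inline void check_offsets_within_one_vec4()
{
   for (uint32_t off = 16; off < 32; ++off)
      assert(vec4_index_from_byte_offset(off) == 1);
}
```

If the caller already passes 16-byte-aligned offsets, the truncation never
comes into play; the sketch just makes the behaviour for unaligned offsets
explicit.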