On Wed, Jun 3, 2015 at 5:09 AM, Neil Roberts <n...@linux.intel.com> wrote: > Looks good to me. Thanks for fixing this. I guess I still have more to > learn about the ISA. > > However, should we not also fix the vec4 version? With that, > > Reviewed-by: Neil Roberts <n...@linux.intel.com>
Yes, I'll do that too. I started to... and then noticed that the code coming out of the vec4 backend was entirely broken -- the send instruction had a null src0 -- and forgot. After some investigation, I realized that's what your other patch is fixing! (i965/vec4: Fix the source register for indexed samplers). > > If we wanted to play safe and avoid the MUL, we could change it to this > and still avoid having a temporary: > > /* addr = (((sampler << 8) | sampler) + > * base_binding_table_index) & 0xfff > */ > brw_SHL(p, addr, sampler_reg, brw_imm_ud(8u)); > brw_OR(p, addr, addr, sampler_reg); > > I just thought the MUL trick was more fun. I'm not sure which would be > faster. The SHL+OR might take up fewer clock cycles but it also > introduces a bubble because it is writing to a register and then > immediately reading from it in the next instruction. > > On the other hand it's probably really not worth worrying about > optimising this bit of code because literally nothing is using it in > shader-db. We can just go with whatever works. Nah, the multiply's fine. Most instructions have the same latency (time before results are ready) and issue time (time before the EU can issue another independent instruction). MUL/SHL/OR all are equal, from what I've measured. _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev