On 03/10/2014 07:59 AM, Juha-Pekka Heikkila wrote:
>

I might add this to the commit message:

This allows us to emit ADD/MUL/MAC instead of MUL/ADD/MUL/ADD, saving
one instruction and two temporary registers.
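
To make the count concrete, here is a scalar sketch of the two sequences.
It is plain C++, not Mesa code: `acc` stands in for the hardware
accumulator, both function names are made up for illustration, and MAC is
assumed to compute dst = src0 * src1 + acc.

```cpp
// Scalar model only; not Mesa code. `acc` stands in for the hardware
// accumulator, and both function names are hypothetical.

// Old sequence: MUL, ADD, MUL, ADD with three temporaries.
float lrp_mul_add(float x, float y, float a)
{
   float y_times_a = y * a;                       // MUL
   float one_minus_a = -a + 1.0f;                 // ADD
   float x_times_one_minus_a = x * one_minus_a;   // MUL
   return x_times_one_minus_a + y_times_a;        // ADD
}

// New sequence: ADD, MUL into the accumulator, MAC. One temporary.
float lrp_mac(float x, float y, float a)
{
   float one_minus_a = -a + 1.0f;   // ADD one_minus_a, -a, 1.0f
   float acc = y * a;               // MUL acc, y, a
   return x * one_minus_a + acc;    // MAC dst, x, one_minus_a (assumed semantics)
}
```

With x = 2, y = 4, a = 0.25, both functions return 2.5.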

> Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikk...@gmail.com>
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 26 +++++++-------------------
>  1 file changed, 7 insertions(+), 19 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index dc58457..4e4ab6e 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> @@ -1160,26 +1160,14 @@ vec4_visitor::emit_lrp(const dst_reg &dst,
>        emit(LRP(dst,
>                 fix_3src_operand(a), fix_3src_operand(y), fix_3src_operand(x)));
>     } else {
> -      /* Earlier generations don't support three source operations, so we
> -       * need to emit x*(1-a) + y*a.
> -       *
> -       * A better way to do this would be:
> -       *    ADD one_minus_a, negate(a), 1.0f
> -       *    MUL null, y, a
> -       *    MAC dst, x, one_minus_a
> -       * but we would need to support MAC and implicit accumulator.
> -       */
> -      dst_reg y_times_a           = dst_reg(this, glsl_type::vec4_type);
> -      dst_reg one_minus_a         = dst_reg(this, glsl_type::vec4_type);
> -      dst_reg x_times_one_minus_a = dst_reg(this, glsl_type::vec4_type);
> -      y_times_a.writemask           = dst.writemask;
> -      one_minus_a.writemask         = dst.writemask;
> -      x_times_one_minus_a.writemask = dst.writemask;
> -
> -      emit(MUL(y_times_a, y, a));
> +      dst_reg one_minus_a   = dst_reg(this, glsl_type::vec4_type);
> +      one_minus_a.writemask = dst.writemask;
> +
> +      struct brw_reg acc = retype(brw_acc_reg(), dst.type);
> +
>        emit(ADD(one_minus_a, negate(a), src_reg(1.0f)));
> -      emit(MUL(x_times_one_minus_a, x, src_reg(one_minus_a)));
> -      emit(ADD(dst, src_reg(x_times_one_minus_a), src_reg(y_times_a)));
> +      emit(MUL(acc, y, a));
> +      emit(MAC(dst, x, src_reg(one_minus_a)));
>     }
>  }

I think the intention was to do:

vec4_instruction *mul = emit(MUL(dst_null_f(), y, a));
mul->writes_accumulator = true;

but (and I feel really stupid now), I think your code using the
accumulator as an explicit destination will work just fine, too.

I'm not sure if there's any advantage to doing it one way or the other.
Matt, any thoughts?

Either way, this patch is:
Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>
