On Sun, Jan 8, 2017 at 10:53 PM, Matt Turner <matts...@gmail.com> wrote:
> On 01/05, Samuel Iglesias Gonsálvez wrote:
>>
>> From: "Juan A. Suarez Romero" <jasua...@igalia.com>
>>
>> When dealing with DF uniforms with just 1 component, we set stride 0 to
>> use the value along the operation. However, when duplicating the
>> regioning parameters in IVB/VLV, we are violating the regioning
>> restrictions.
>>
>> So instead of using the value with stride 0, we just duplicate it in a
>> register, and then use the register instead, avoiding a DF with stride 0.
>> ---
>> src/mesa/drivers/dri/i965/brw_fs.cpp | 63
>> ++++++++++++++++++++++++++++++++++++
>> src/mesa/drivers/dri/i965/brw_fs.h   |  1 +
>> 2 files changed, 64 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index eb3b4aa..78f2124 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -2168,6 +2168,62 @@ fs_visitor::lower_constant_loads()
>>    invalidate_live_intervals();
>> }
>>
>> +/**
>> + * When dealing with double-precision floats (DF) in IVB/VLV, we need to
>> + * duplicate the regioning parameters. This means that for a DF scalar
>> + * (regioning <0,1,0>) we will end up using regioning <0,2,0>. But
>> according
>> + * to General Restrictions on Regioning Parameters (Ivy PRM, Vol. 4 Part
>> 3,
>> + * page 69), if VertStride = HorzStride = 0, Width must be 1 regardless
>> of the
>> + * value of ExecSize. So we would be violating the restriction. To
>> overcome
>> + * it, this lowering step duplicates the scalar in a couple of registers,
>> + * reading it as two floats to avoid the restriction.
>
>
> Huh, I would have thought that a <0,1,0>DF region would have done what
> we wanted, without the need to double any of the region parameters.
>
> I haven't tested yet, so I'll play with it tomorrow and see if it blows
> up.

Indeed, it looks like gX<0,2,1>DF works for scalar sources.

The following patch seems to work for me in place of 04/22.

I would expect if we had an instruction like

add(16)  gX<1>DF  gY<8,8,1>DF  gZ<0,2,1>DF

that gX would write 2 registers and gY would read two registers, but
gZ would read only one register -- and that would violate the
restriction that an instruction that writes two registers must also
read two registers.

But it seems that we never generate a DF instruction with exec_size=16
because of

4554       /* Note that in IVB/VLV for instructions that handles DF,
we will duplicate
4555        * the exec_size. So take this value for calculus purposes.
4556        */
4557       unsigned exec_size = inst->exec_size;
4558       if (devinfo->gen == 7 && !devinfo->is_haswell &&
inst->exec_data_size() == 8)
4559          exec_size *= 2;

... which I think Curro NAK'd elsewhere.

I think what we want is to use 0,2,1 regions for scalar sources, and
split exec_size=16 (i.e., 8 DFs) instructions when they use scalar
sources in order to avoid violating the restriction. Otherwise, we
should emit exec_size=16 DF instructions where possible.
From 67bab74148e110171b8f3a999dc1d79616013108 Mon Sep 17 00:00:00 2001
From: Matt Turner <matts...@gmail.com>
Date: Tue, 10 Jan 2017 19:33:22 -0800
Subject: [PATCH] i965: Use <0,2,1> region for scalar DF sources on IVB/BYT.

On HSW+, scalar DF sources can be accessed using the normal <0,1,0>
region, but on IVB and BYT DF regions must be programmed in terms of
floats. A <0,2,1> region accomplishes this.
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 2843385..5f81b7a 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -495,9 +495,16 @@ brw_set_src0(struct brw_codegen *p, brw_inst *inst, struct brw_reg reg)
             brw_inst_set_src0_width(devinfo, inst, BRW_WIDTH_1);
             brw_inst_set_src0_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_0);
 	 } else {
-            brw_inst_set_src0_hstride(devinfo, inst, reg.hstride);
-            brw_inst_set_src0_width(devinfo, inst, reg.width);
-            brw_inst_set_src0_vstride(devinfo, inst, reg.vstride);
+            if (devinfo->gen == 7 && !devinfo->is_haswell &&
+                reg.type == BRW_REGISTER_TYPE_DF && has_scalar_region(reg)) {
+               brw_inst_set_src0_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_0);
+               brw_inst_set_src0_width(devinfo, inst, BRW_WIDTH_2);
+               brw_inst_set_src0_hstride(devinfo, inst, BRW_HORIZONTAL_STRIDE_1);
+            } else {
+               brw_inst_set_src0_vstride(devinfo, inst, reg.vstride);
+               brw_inst_set_src0_width(devinfo, inst, reg.width);
+               brw_inst_set_src0_hstride(devinfo, inst, reg.hstride);
+            }
 	 }
       } else {
          brw_inst_set_src0_da16_swiz_x(devinfo, inst,
@@ -577,9 +584,16 @@ brw_set_src1(struct brw_codegen *p, brw_inst *inst, struct brw_reg reg)
             brw_inst_set_src1_width(devinfo, inst, BRW_WIDTH_1);
             brw_inst_set_src1_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_0);
 	 } else {
-            brw_inst_set_src1_hstride(devinfo, inst, reg.hstride);
-            brw_inst_set_src1_width(devinfo, inst, reg.width);
-            brw_inst_set_src1_vstride(devinfo, inst, reg.vstride);
+            if (devinfo->gen == 7 && !devinfo->is_haswell &&
+                reg.type == BRW_REGISTER_TYPE_DF && has_scalar_region(reg)) {
+               brw_inst_set_src1_vstride(devinfo, inst, BRW_VERTICAL_STRIDE_0);
+               brw_inst_set_src1_width(devinfo, inst, BRW_WIDTH_2);
+               brw_inst_set_src1_hstride(devinfo, inst, BRW_HORIZONTAL_STRIDE_1);
+            } else {
+               brw_inst_set_src1_vstride(devinfo, inst, reg.vstride);
+               brw_inst_set_src1_width(devinfo, inst, reg.width);
+               brw_inst_set_src1_hstride(devinfo, inst, reg.hstride);
+            }
 	 }
       } else {
          brw_inst_set_src1_da16_swiz_x(devinfo, inst,
-- 
2.7.3

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to