On 01/05, Samuel Iglesias Gonsálvez wrote:
From: "Juan A. Suarez Romero" <jasua...@igalia.com>

In IVB and VLV, both regioning parameters and execution sizes are measured as
floats.

So when we have something like:

mov(8) g2<1>DF g3<4,4,1>DF

We are not actually moving 8 doubles (our intention), but 4 doubles.

We need to duplicate the parameters to cope with this issue.

s/duplicate/double/
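
To make the arithmetic in the commit message concrete, here is a minimal sketch. It assumes only what the message states: the execution size is counted in 4-byte floats. The helper names are hypothetical and do not exist in Mesa.

```cpp
#include <cassert>

// Hypothetical helper: number of 8-byte doubles actually covered when the
// hardware interprets the execution size as a count of 4-byte floats.
inline unsigned doubles_covered(unsigned exec_size_in_floats)
{
   assert(exec_size_in_floats % 2 == 0); // a double occupies two float slots
   return exec_size_in_floats * 4u / 8u;
}

// Hypothetical helper: execution size to request so that n doubles are moved.
inline unsigned exec_size_for_doubles(unsigned n)
{
   return 2u * n;
}
```

With these, doubles_covered(8) gives 4 (the mov(8) case above), and exec_size_for_doubles(8) gives 16, which is the doubling the patch introduces.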

---
src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 47 ++++++++++++++++++++++----
1 file changed, 41 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 0710be9..90ee7c1 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -54,13 +54,14 @@ brw_file_from_reg(fs_reg *reg)
}

static struct brw_reg
-brw_reg_from_fs_reg(fs_inst *inst, fs_reg *reg, unsigned gen, bool compressed)
+brw_reg_from_fs_reg(const struct brw_compiler *compiler, fs_inst *inst,

I think we should pass a const struct gen_device_info *devinfo instead
of a brw_compiler*

+                    fs_reg *reg, bool compressed)
{
   struct brw_reg brw_reg;

   switch (reg->file) {
   case MRF:
-      assert((reg->nr & ~BRW_MRF_COMPR4) < BRW_MAX_MRF(gen));
+      assert((reg->nr & ~BRW_MRF_COMPR4) < BRW_MAX_MRF(compiler->devinfo->gen));
      /* Fallthrough */
   case VGRF:
      if (reg->stride == 0) {
@@ -93,6 +94,37 @@ brw_reg_from_fs_reg(fs_inst *inst, fs_reg *reg, unsigned gen, bool compressed)
         const unsigned width = MIN2(reg_width, phys_width);
         brw_reg = brw_vecn_reg(width, brw_file_from_reg(reg), reg->nr, 0);
         brw_reg = stride(brw_reg, width * reg->stride, width, reg->stride);
+         /* From the Ivy PRM (EU Changes by Processor Generation, page 13):

s/Ivy/Ivy Bridge/

+          *  "Each DF (Double Float) operand uses an element size of 4 rather
+          *  than 8 and all regioning parameters are twice what the values

align each additional line of the quotation with the first line:

  "Each DF (Double Float) operand uses an element size of 4 rather
   than 8 and all regioning parameters are twice what the values
   ...

+          *  would be based on the true element size: ExecSize, Width,
+          *  HorzStride, and VertStride. Each DF operand uses a pair of
+          *  channels and all masking and swizzing should be adjusted
+          *  appropriately."
+          *
+          * From the Ivy PRM (Special Requirements for Handling Double

s/Ivy/Ivy Bridge/

+          * Precision Data Types, page 71):
+          *  "In Align1 mode, all regioning parameters like stride, execution
+          *  size, and width must use the syntax of a pair of packed
+          *  floats. The offsets for these data types must be 64-bit
+          *  aligned. The execution size and regioning parameters are in terms
+          *  of floats."

Same alignment fix for this quotation.

+          *
+          * All these paragraphs summarizes that in Ivy, when handling DF,
+          * exec_size, width and vertstride must be duplicated. And Horzstride
+          * should be duplicated when it is greater than 1.

I'd rewrite this a little bit. How about

        Summarized: when handling DF-typed arguments, ExecSize,
        VertStride, and Width must be doubled, and HorzStride must be
        doubled when the region is not scalar.
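
As a rough illustration of that summary, the sketch below applies the rule to plain integers. Everything in it is an assumption made for illustration: the Region struct and function names are hypothetical, it does not model the real struct brw_reg encoding, and treating <0;1,0> as "scalar" is my reading of the wording, not something the PRM quote states.

```cpp
#include <cassert>

// Hypothetical plain-integer region description (not struct brw_reg).
struct Region {
   unsigned exec_size;
   unsigned vstride;
   unsigned width;
   unsigned hstride;
};

// Assumption: "scalar" means the <0;1,0> region.
inline bool is_scalar(const Region &r)
{
   return r.vstride == 0 && r.width == 1 && r.hstride == 0;
}

// Literal transcription of the summarized rule for DF-typed arguments:
// double ExecSize, VertStride and Width; double HorzStride unless scalar.
inline Region adjust_for_df(Region r)
{
   assert(r.width > 0);
   const bool scalar = is_scalar(r); // decide before modifying the fields
   r.exec_size *= 2;
   r.vstride *= 2;
   r.width *= 2;
   if (!scalar)
      r.hstride *= 2;
   return r;
}
```

For the region from the commit message, adjust_for_df({8, 4, 4, 1}) yields {16, 8, 8, 2} under these assumptions.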

+          *
+          * It applies to Valleyview too.

I just looked -- we actually do a surprisingly good job of always using
the name BayTrail (BYT) instead of Valleyview (VLV) in i965. Let's
s/Valleyview/BayTrail/

        This also applies to BayTrail.

Reviewed-by: Matt Turner <matts...@gmail.com>
