On 01/05, Samuel Iglesias Gonsálvez wrote:
From: "Juan A. Suarez Romero" <jasua...@igalia.com>

In IVB and VLV, both regioning parameters and execution sizes are measured as floats. So when we have something like:

    mov(8) g2<1>DF g3<4,4,1>DF

We are not actually moving 8 doubles (our intention), but 4 doubles. We need to duplicate the parameters to cope with this issue.
s/duplicate/double/
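To make the arithmetic in the commit message concrete, here is a minimal standalone sketch. The helper name `doubles_covered` is hypothetical and not part of Mesa; it only assumes, as the commit message states, that the hardware counts the execution size in 32-bit floats.

```cpp
#include <cassert>

// Hypothetical helper, not part of Mesa: number of 64-bit (DF) elements an
// instruction actually covers when its execution size is counted in 32-bit
// floats, as the commit message describes for IVB and VLV.
constexpr unsigned doubles_covered(unsigned exec_size_in_floats)
{
   const unsigned float_bytes = 4;
   const unsigned double_bytes = 8;
   return exec_size_in_floats * float_bytes / double_bytes;
}

// mov(8) only covers 4 doubles; an execution size of 16 is needed for 8.
static_assert(doubles_covered(8) == 4, "mov(8) covers 4 doubles");
static_assert(doubles_covered(16) == 8, "mov(16) covers 8 doubles");
```

Under that assumption, the execution size has to be doubled to move the intended 8 doubles.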
---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 47 ++++++++++++++++++++++----
 1 file changed, 41 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 0710be9..90ee7c1 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -54,13 +54,14 @@ brw_file_from_reg(fs_reg *reg)
 }

 static struct brw_reg
-brw_reg_from_fs_reg(fs_inst *inst, fs_reg *reg, unsigned gen, bool compressed)
+brw_reg_from_fs_reg(const struct brw_compiler *compiler, fs_inst *inst,
I think we should pass a const struct gen_device_info *devinfo instead of a brw_compiler *.
+                    fs_reg *reg, bool compressed)
 {
    struct brw_reg brw_reg;

    switch (reg->file) {
    case MRF:
-      assert((reg->nr & ~BRW_MRF_COMPR4) < BRW_MAX_MRF(gen));
+      assert((reg->nr & ~BRW_MRF_COMPR4) < BRW_MAX_MRF(compiler->devinfo->gen));
       /* Fallthrough */
    case VGRF:
       if (reg->stride == 0) {
@@ -93,6 +94,37 @@ brw_reg_from_fs_reg(fs_inst *inst, fs_reg *reg, unsigned gen, bool compressed)
          const unsigned width = MIN2(reg_width, phys_width);
          brw_reg = brw_vecn_reg(width, brw_file_from_reg(reg), reg->nr, 0);
          brw_reg = stride(brw_reg, width * reg->stride, width, reg->stride);
+         /* From the Ivy PRM (EU Changes by Processor Generation, page 13):
s/Ivy/Ivy Bridge/
+          * "Each DF (Double Float) operand uses an element size of 4 rather
+          * than 8 and all regioning parameters are twice what the values
align each additional line of the quotation with the first line:

    "Each DF (Double Float) operand uses an element size of 4 rather
     than 8 and all regioning parameters are twice what the values
     ...
+          * would be based on the true element size: ExecSize, Width,
+          * HorzStride, and VertStride. Each DF operand uses a pair of
+          * channels and all masking and swizzing should be adjusted
+          * appropriately."
+          *
+          * From the Ivy PRM (Special Requirements for Handling Double
s/Ivy/Ivy Bridge/
+          * Precision Data Types, page 71):
+          * "In Align1 mode, all regioning parameters like stride, execution
+          * size, and width must use the syntax of a pair of packed
+          * floats. The offsets for these data types must be 64-bit
+          * aligned. The execution size and regioning parameters are in terms
+          * of floats."
alignment
+          *
+          * All these paragraphs summarizes that in Ivy, when handling DF,
+          * exec_size, width and vertstride must be duplicated. And Horzstride
+          * should be duplicated when it is greater than 1.
I'd rewrite this a little bit. How about:

    Summarized: when handling DF-typed arguments, ExecSize, VertStride, and Width must be doubled, and HorzStride must be doubled when the region is not scalar.
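For illustration only, the doubling of ExecSize, VertStride, and Width can be sketched as below. This is not the code from the patch: `df_region` and `to_float_units` are hypothetical names, the fields hold plain element counts rather than hardware encodings, and HorzStride is deliberately left untouched because the patch comment ("greater than 1") and the rewording above ("not scalar") state its condition differently.

```cpp
#include <cassert>

// Hypothetical, simplified description of a DF operand's region in element
// units. This is NOT Mesa's struct brw_reg and does not use the hardware
// encodings of the stride and width fields.
struct df_region {
   unsigned exec_size;
   unsigned vstride;
   unsigned width;
   unsigned hstride;
};

// Assumption: the input is expressed in 64-bit (DF) elements and the result
// is the same region expressed in 32-bit float units, as the quoted PRM text
// requires on Ivy Bridge.
inline df_region to_float_units(df_region r)
{
   r.exec_size *= 2;
   r.vstride *= 2;
   r.width *= 2;
   // hstride is intentionally not modified in this sketch (see lead-in).
   return r;
}
```

Under these assumptions, the `mov(8)` with a `<4,4,1>DF` source from the commit message turns into an execution size of 16 with an `<8,8,1>` region in float units.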
+          *
+          * It applies to Valleyview too.
I just looked -- we actually do a surprisingly good job of always using the name BayTrail (BYT) instead of Valleyview (VLV) in i965. Let's s/Valleyview/BayTrail/

    This also applies to BayTrail.

Reviewed-by: Matt Turner <matts...@gmail.com>