From: Iago Toral Quiroga <ito...@igalia.com> Because the stride is in units for the type, if we copy-propagate from a another instruction using a larger type, then we need to make sure that the source in that instruction, the one we will be copy-propagating from, sources consecutive elements, otherwise, when sourced using a a smaller type, the actual elements read would change.
Prevents that we turn this: mov(8) vgrf3+2.0:DF, vgrf11<0>:DF load_payload(8) vgrf15:UD, ..., vgrf3+2.0<0>:D, vgrf3+3.0<0>:D Into: mov(8) vgrf3+2.0:DF, vgrf11<0>:DF load_payload(8) vgrf15:UD, ..., vgrf11<0>:D, vgrf11<0>:D In the original instructions, vgrf3+2.0<0>:D reads a replicated 64-bit value, while the result after copy propagation only reads the first 32-bit of the value. --- src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp index 9fc06cb..f98ab41 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp @@ -400,6 +400,15 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry) if (type_sz(entry->dst.type) < type_sz(inst->src[arg].type)) return false; + /* Bail if the instruction type is larger than the execution type of the + * copy and the stride of the source we would be copy propagating from + * has a stride other than 1. Otherwise, since the stride is in units of + * the type, we would be changing the region effectively sourced. + */ + if (type_sz(entry->dst.type) > type_sz(inst->src[arg].type) && + entry->src.stride != 1) + return false; + /* Bail if the result of composing both strides cannot be expressed * as another stride. This avoids, for example, trying to transform * this: -- 2.5.0 _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev