On Tue, 2016-05-10 at 17:28 -0700, Francisco Jerez wrote: > Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes: > > > From: Iago Toral Quiroga <ito...@igalia.com> > > > > Because the stride is in units for the type, if we copy-propagate from > > a another instruction using a larger type, then we need to make sure > > that the source in that instruction, the one we will be copy-propagating > > from, sources consecutive elements, otherwise, when sourced using a > > a smaller type, the actual elements read would change. > > > > Prevents that we turn this: > > mov(8) vgrf3+2.0:DF, vgrf11<0>:DF > > load_payload(8) vgrf15:UD, ..., vgrf3+2.0<0>:D, vgrf3+3.0<0>:D > > > > Into: > > mov(8) vgrf3+2.0:DF, vgrf11<0>:DF > > load_payload(8) vgrf15:UD, ..., vgrf11<0>:D, vgrf11<0>:D > > > > Sorry but I don't see the problem, the two assembly examples look fully > equivalent to me.
Right, I copied the wrong chunk when writing the commit log. The problem I was trying to address with this was related to your last comment to patch 5, which was still not computing the right subreg_offset in all scenarios. Basically, I had something like: mov(8) vgrf4+0.0:UD vgrf69+0.4<2>:UD and(8) vgrf4+0.0:UD vgrf4+0.0:UD 2147483648u And copy propagation was turning it into: mov(8) vgrf4+0.0:UD u0+0.4<0>:UD and(8) vgrf4+0.0:UD u0<0>:UD 2147483648u So it is basically the subreg_offset calculation that was wrong (which the 2-patch series you attached in reply to patch 5 fix). In retrospect, I am not sure how I came up with this patch because it does not even try to address the actual problem, it simply prevents copy-propagation from acting on the first instruction, which then prevents the bug to show up in the second one so it simply fixed the problem by accident. I don't know what I was thinking :-/ With your two patches the problem is fixed so we can just drop this patch. > > In the original instructions, vgrf3+2.0<0>:D reads a replicated 64-bit > > value, while the result after copy propagation only reads the first > > 32-bit of the value. > > --- > > src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 9 +++++++++ > > 1 file changed, 9 insertions(+) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > > b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > > index 9fc06cb..f98ab41 100644 > > --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > > @@ -400,6 +400,15 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, > > acp_entry *entry) > > if (type_sz(entry->dst.type) < type_sz(inst->src[arg].type)) > > return false; > > > > + /* Bail if the instruction type is larger than the execution type of the > > + * copy and the stride of the source we would be copy propagating from > > + * has a stride other than 1. Otherwise, since the stride is in units of > > + * the type, we would be changing the region effectively sourced. > > + */ > > + if (type_sz(entry->dst.type) > type_sz(inst->src[arg].type) && > > + entry->src.stride != 1) > > + return false; > > NAK, there is a more accurate condition just a few lines below making > sure that the strides can be composed correctly when the original > entry->src.stride is not one. Some special cases of > 'type_sz(entry->dst.type) > type_sz(inst->src[arg].type) && > entry->src.stride != 1' are handled by copy propagation, and the cases > that are not should already be caught by the check immediately below. > > > + > > /* Bail if the result of composing both strides cannot be expressed > > * as another stride. This avoids, for example, trying to transform > > * this: > > -- > > 2.5.0 > > > > _______________________________________________ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev