On 2016-05-03 05:21:54, Samuel Iglesias Gonsálvez wrote: > From: Iago Toral Quiroga <ito...@igalia.com> > > The current code ignores the suboffet in the instruction's source > and just uses the one from the constant. This is not correct > when the instruction's source is accessing the constant with a > different type and using the suboffset to select a specific > chunk of the constant. We generate this kind of code in fp64 > when we want to select only the high 32-bit of a particular > double constant. > > Instead, we should add any existing suboffset in the > instruction's source (modulo the size of the entry's type) > to the suboffset in the constant so we can preserve the orinal > semantics. > > Prevents that we turn this: > > mov(8) vgrf5:DF, u2<0>:DF > mov(8) vgrf7:UD, vgrf5+0.4<2>:UD > > Into: > > mov(8) vgrf7:UD, u2<0>:UD > > And instead, with this patch, we produce: > > mov(8) vgrf7:UD, u2+0.4<0>:UD > --- > .../drivers/dri/i965/brw_fs_copy_propagation.cpp | 23 > ++++++++++++++++++++-- > 1 file changed, 21 insertions(+), 2 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > index aa4c9c9..5fae10f 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > @@ -445,8 +445,27 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, > acp_entry *entry) > case BAD_FILE: > case ARF: > case FIXED_GRF: > - inst->src[arg].reg_offset = entry->src.reg_offset; > - inst->src[arg].subreg_offset = entry->src.subreg_offset; > + { > + inst->src[arg].reg_offset = entry->src.reg_offset; > + inst->src[arg].subreg_offset = entry->src.subreg_offset; > + > + /* If we copy propagate from a larger type we have to be aware that > + * the instruction might be using subreg_offset to select a > particular > + * chunk of the data in the entry. For example: > + * > + * mov(8) vgrf5:DF, u2<0>:DF > + * mov(8) vgrf7:UD, vgrf5+0.4<2>:UD > + * > + * vgrf5+0.4<2>:UD is actually reading the high 32-bit of u2.0, so > if > + * we want to copy propagate here we have to do it from u2+0.4. > + */ > + int type_sz_src = type_sz(inst->src[arg].type); > + int type_sz_entry = type_sz(entry->src.type); > + if (type_sz_entry > type_sz_src) { > + inst->src[arg].subreg_offset += > + inst->src[arg].subreg_offset % type_sz_entry;
Seeing 'inst->src[arg].subreg_offset' on both sides of this += seems strange. Is this correct? -Jordan > + } > + } > break; > case ATTR: > case VGRF: > -- > 2.5.0 > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev