fs: consider execsize can be duplicated in lower_simd_with

Juan A. Suarez Romero Tue, 10 Jan 2017 04:43:58 -0800

On Mon, 2017-01-09 at 15:41 -0800, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> 
> > From: "Juan A. Suarez Romero" <jasua...@igalia.com>
> > 
> > In IVB/VLV, for instructions dealing with DF, execsize will be
> > duplicated in the final code.
> > 
> > So take this in account when checking if instructions should be
> > split.
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs.cpp | 13 ++++++++++++-
> >  1 file changed, 12 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > index 78f2124..cfce364 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > @@ -4551,8 +4551,15 @@ static unsigned
> >  get_fpu_lowered_simd_width(const struct gen_device_info *devinfo,
> >                             const fs_inst *inst)
> >  {
> > +   /* Note that in IVB/VLV for instructions that handles DF, we
> > will duplicate
> > +    * the exec_size. So take this value for calculus purposes.
> > +    */
> > +   unsigned exec_size = inst->exec_size;
> > +   if (devinfo->gen == 7 && !devinfo->is_haswell && inst-
> > >exec_data_size() == 8)
> > +      exec_size *= 2;
> > +
> 
> NAK.  This won't have any useful effect because the max_width value
> doesn't actually have any influence in the regioning restriction
> calculations below, it's only used to accumulate the intermediate
> execution size limits derived from each hardware restriction.  I
> don't
> think you need to change any of the restrictions below for the result
> to
> be correct in presence of exec size doubling.




Yes, you're right.

Actually the motivation for the patch was to avoid ending up with a
final exec_size > 8  (after doubling it in the generator) when we are
generating a SIMD8 shader (or exec_size > 16 in SIMD16 shader).

So if we have a exec_size == 8, and we are generating SIMD8 code, then
we want to split in two instructions of exec_size==4, because at the
end we will double the exec_size to 8. Same reasoning with SIMD16 code.


This could be done as:

@@ -5119,7 +5119,11 @@ fs_visitor::lower_simd_width()
    bool progress = false;
 
    foreach_block_and_inst_safe(block, fs_inst, inst, cfg) {
-      const unsigned lower_width = get_lowered_simd_width(devinfo,
inst);
+      unsigned lower_width = get_lowered_simd_width(devinfo, inst);
+
+      if (devinfo->gen == 7 && !devinfo->is_haswell &&
+          (get_exec_type_size(inst) == 8 || type_sz(inst->dst.type) ==
8))
+         lower_width = MIN2(lower_width, dispatch_width / 2);


Now, this is all in theory. So far, no test has exposed this
situation with this change of top of our branch, and the lower_width
returned by get_lowered_simd_width() is always less than 8. But maybe
this could happen in future, for instance, if we emit an instruction
that writes more than 4 DF and has force_writemask_all == TRUE.

> 
> >     /* Maximum execution size representable in the instruction controls. */
> > -   unsigned max_width = MIN2(32, inst->exec_size);
> > +   unsigned max_width = MIN2(32, exec_size);
> >  
> >     /* According to the PRMs:
> >      *  "A. In Direct Addressing mode, a source cannot span more than 2
> > @@ -4656,6 +4663,10 @@ get_fpu_lowered_simd_width(const struct 
> > gen_device_info *devinfo,
> >           max_width = MIN2(max_width, channels_per_grf);
> >     }
> >  
> > +   /* If we have duplicated exec_size, then readjust max_width if 
> > required. */
> > +   if (exec_size != inst->exec_size && max_width == exec_size)
> > +      max_width = inst->exec_size;
> > +
> >     /* Only power-of-two execution sizes are representable in the 
> > instruction
> >      * control fields.
> >      */
> > -- 
> > 2.9.3
> > 
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 05/22] i965/fs: consider execsize can be duplicated in lower_simd_with

Reply via email to