Re: [14/n] PR85694: Rework overwidening detection

Christophe Lyon Mon, 02 Jul 2018 06:52:21 -0700

On Mon, 2 Jul 2018 at 15:37, Richard Sandiford
<richard.sandif...@arm.com> wrote:
>
> Christophe Lyon <christophe.l...@linaro.org> writes:
> > On Fri, 29 Jun 2018 at 13:36, Richard Sandiford
> > <richard.sandif...@arm.com> wrote:
> >>
> >> Richard Sandiford <richard.sandif...@arm.com> writes:
> >> > This patch is the main part of PR85694.  The aim is to recognise at 
> >> > least:
> >> >
> >> >   signed char *a, *b, *c;
> >> >   ...
> >> >   for (int i = 0; i < 2048; i++)
> >> >     c[i] = (a[i] + b[i]) >> 1;
> >> >
> >> > as an over-widening pattern, since the addition and shift can be done
> >> > on shorts rather than ints.  However, it ended up being a lot more
> >> > general than that.
> >> >
> >> > The current over-widening pattern detection is limited to a few simple
> >> > cases: logical ops with immediate second operands, and shifts by a
> >> > constant.  These cases are enough for common pixel-format conversion
> >> > and can be detected in a peephole way.
> >> >
> >> > The loop above requires two generalisations of the current code: support
> >> > for addition as well as logical ops, and support for non-constant second
> >> > operands.  These are harder to detect in the same peephole way, so the
> >> > patch tries to take a more global approach.
> >> >
> >> > The idea is to get information about the minimum operation width
> >> > in two ways:
> >> >
> >> > (1) by using the range information attached to the SSA_NAMEs
> >> >     (effectively a forward walk, since the range info is
> >> >     context-independent).
> >> >
> >> > (2) by back-propagating the number of output bits required by
> >> >     users of the result.
> >> >
> >> > As explained in the comments, there's a balance to be struck between
> >> > narrowing an individual operation and fitting in with the surrounding
> >> > code.  The approach is pretty conservative: if we could narrow an
> >> > operation to N bits without changing its semantics, it's OK to do that 
> >> > if:
> >> >
> >> > - no operations later in the chain require more than N bits; or
> >> >
> >> > - all internally-defined inputs are extended from N bits or fewer,
> >> >   and at least one of them is single-use.
> >> >
> >> > See the comments for the rationale.
> >> >
> >> > I didn't bother adding STMT_VINFO_* wrappers for the new fields
> >> > since the code seemed more readable without.
> >> >
> >> > Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
> >>
> >> Here's a version rebased on top of current trunk.  Changes from last time:
> >>
> >> - reintroduce dump_generic_expr_loc, with the obvious change to the
> >>   prototype
> >>
> >> - fix a typo in a comment
> >>
> >> - use vect_element_precision from the new version of 12/n.
> >>
> >> Tested as before.  OK to install?
> >>
> >
> > Hi Richard,
> >
> > This patch introduces regressions on arm-none-linux-gnueabihf:
> >     gcc.dg/vect/vect-over-widen-1-big-array.c -flto -ffat-lto-objects
> > scan-tree-dump-times vect "vect_recog_widen_shift_pattern: detected" 2
> >     gcc.dg/vect/vect-over-widen-1-big-array.c scan-tree-dump-times
> > vect "vect_recog_widen_shift_pattern: detected" 2
> >     gcc.dg/vect/vect-over-widen-1.c -flto -ffat-lto-objects
> > scan-tree-dump-times vect "vect_recog_widen_shift_pattern: detected" 2
> >     gcc.dg/vect/vect-over-widen-1.c scan-tree-dump-times vect
> > "vect_recog_widen_shift_pattern: detected" 2
> >     gcc.dg/vect/vect-over-widen-4-big-array.c -flto -ffat-lto-objects
> > scan-tree-dump-times vect "vect_recog_widen_shift_pattern: detected" 2
> >     gcc.dg/vect/vect-over-widen-4-big-array.c scan-tree-dump-times
> > vect "vect_recog_widen_shift_pattern: detected" 2
> >     gcc.dg/vect/vect-over-widen-4.c -flto -ffat-lto-objects
> > scan-tree-dump-times vect "vect_recog_widen_shift_pattern: detected" 2
> >     gcc.dg/vect/vect-over-widen-4.c scan-tree-dump-times vect
> > "vect_recog_widen_shift_pattern: detected" 2
> >     gcc.dg/vect/vect-widen-shift-s16.c -flto -ffat-lto-objects
> > scan-tree-dump-times vect "vect_recog_widen_shift_pattern: detected" 8
> >     gcc.dg/vect/vect-widen-shift-s16.c scan-tree-dump-times vect
> > "vect_recog_widen_shift_pattern: detected" 8
> >     gcc.dg/vect/vect-widen-shift-s8.c -flto -ffat-lto-objects
> > scan-tree-dump-times vect "vect_recog_widen_shift_pattern: detected" 1
> >     gcc.dg/vect/vect-widen-shift-s8.c scan-tree-dump-times vect
> > "vect_recog_widen_shift_pattern: detected" 1
> >     gcc.dg/vect/vect-widen-shift-u16.c -flto -ffat-lto-objects
> > scan-tree-dump-times vect "vect_recog_widen_shift_pattern: detected" 1
> >     gcc.dg/vect/vect-widen-shift-u16.c scan-tree-dump-times vect
> > "vect_recog_widen_shift_pattern: detected" 1
> >     gcc.dg/vect/vect-widen-shift-u8.c -flto -ffat-lto-objects
> > scan-tree-dump-times vect "vect_recog_widen_shift_pattern: detected" 2
> >     gcc.dg/vect/vect-widen-shift-u8.c scan-tree-dump-times vect
> > "vect_recog_widen_shift_pattern: detected" 2
>
> Sorry about that, was caused by a stupid typo.  I've applied the
> below as obvious.
>
> (For the record, it was actually 12/n that caused this.  14/n hasn't
> been applied yet.)
>
Sorry about the confusion, I probably messed up in gmail when
searching for the mail containing the patch that caused the
regression.


> Thanks,
> Richard
>
>
> 2018-07-02  Richard Sandiford  <richard.sandif...@arm.com>
>
> gcc/
>         * tree-vect-patterns.c (vect_recog_widen_shift_pattern): Fix typo
>         in dump string.
>
> Index: gcc/tree-vect-patterns.c
> ===================================================================
> --- gcc/tree-vect-patterns.c    2018-07-02 14:30:57.000000000 +0100
> +++ gcc/tree-vect-patterns.c    2018-07-02 14:30:57.383750450 +0100
> @@ -1739,7 +1739,7 @@ vect_recog_widen_shift_pattern (vec<gimp
>  {
>    return vect_recog_widen_op_pattern (stmts, type_out, LSHIFT_EXPR,
>                                       WIDEN_LSHIFT_EXPR, true,
> -                                     "vect_widen_shift_pattern");
> +                                     "vect_recog_widen_shift_pattern");
>  }
>
>  /* Detect a rotate pattern wouldn't be otherwise vectorized:

Re: [14/n] PR85694: Rework overwidening detection

Reply via email to