[Bug tree-optimization/109502] [12/13 Regression] wrong code with -O -ftree-vectorize -fvect-cost-model=unlimited on aarch64

rguenth at gcc dot gnu.org via Gcc-bugs Fri, 14 Apr 2023 00:28:44 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109502


--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #2)
> (In reply to Andrew Pinski from comment #1)
> > SLP transforms:
> > 
> >   g.0_1 = g;
> >   _2 = g.0_1 == 0;
> >   a_7 = (unsigned int) _2;
> >   _3 = a_7 % 6;
> >   _4 = _3 == 0;
> >   _5 = (unsigned int) _4;
> >   a_8 = _5 + a_7;
> > 
> > To:
> > 
> >   g.0_1 = g;
> >   _2 = g.0_1 == 0;
> >   a_7 = (unsigned int) _2;
> >   _3 = a_7 % 6;
> >   _15 = {_3, g.0_1};
> >   mask__4.4_16 = { 0, 0 } == _15;
> >   vect__5.5_19 = VIEW_CONVERT_EXPR<vector(2) unsigned int>(mask__4.4_16);
> >   _17 = BIT_FIELD_REF <mask__4.4_16, 32, 0>;
> >   _18 = (bool) _17;
> >   _4 = _3 == 0;
> >   _5 = (unsigned int) _18;
> >   _20 = .REDUC_PLUS (vect__5.5_19);
> >   a_8 = _20;
> > 
> 
> If anything, a negation is missing after the
> reduc_plus (or before it) when it translates the bool comparisons into vector
> comparisons.

Indeed.  This is usually a failure of bool pattern detection.
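
To make the missing negation concrete, here is a minimal scalar model of the two IL sequences quoted above. It is a sketch, not compiler output: the function names and the mask_lane helper are made up for illustration, and the type of g is assumed to be unsigned int because it shares a vector with _3. The model relies on one fact, namely that a true lane of a vector comparison has all bits set (-1), whereas a scalar bool converted to unsigned int is 1.

```c
#include <assert.h>

/* Scalar semantics of the original statements: each comparison yields 0 or 1. */
unsigned scalar_a8(unsigned g)
{
    unsigned a_7 = (unsigned)(g == 0);
    unsigned _3 = a_7 % 6;
    unsigned _5 = (unsigned)(_3 == 0);
    return _5 + a_7;
}

/* Hypothetical helper modelling one lane of a vector mask:
   all bits set for true, zero for false. */
static unsigned mask_lane(int cond)
{
    return cond ? ~0u : 0u;
}

/* Model of the transformed statements: the mask { _3 == 0, g == 0 } is
   reinterpreted as unsigned int lanes and summed, as .REDUC_PLUS does. */
unsigned vectorized_a8(unsigned g)
{
    unsigned a_7 = (unsigned)(g == 0);
    unsigned _3 = a_7 % 6;
    unsigned lane0 = mask_lane(_3 == 0);
    unsigned lane1 = mask_lane(g == 0);
    return lane0 + lane1;
}

/* The same sum with the negation applied after the reduction
   (unsigned negation wraps, so -(~0u) is 1). */
unsigned vectorized_a8_negated(unsigned g)
{
    return -vectorized_a8(g);
}

/* Self-check: the negated model agrees with the scalar semantics. */
void check_models(void)
{
    assert(vectorized_a8_negated(0) == scalar_a8(0));
    assert(vectorized_a8_negated(5) == scalar_a8(5));
}
```

In this model exactly one lane is true for any g, so the scalar result is always 1 while the un-negated sum is (unsigned) -1.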

unsigned foo (unsigned *p)
{
  unsigned tem1 = p[0] == 0;
  unsigned tem2 = p[1] == 0;
  unsigned tem3 = p[2] == 0;
  unsigned tem4 = p[3] == 0;
  return tem1 + tem2 + tem3 + tem4;
}

doesn't reproduce it - we have put in defences "after the fact" to work
around this for some cases:

t.c:7:29: note:   ==> examining statement: tem4_16 = (unsigned int) _8;
t.c:7:29: note:   vect_is_simple_use: operand _7 == 0, type of def: internal
t.c:7:29: missed:   type conversion to/from bit-precision unsupported.
t.c:7:29: note:   vect_is_simple_use: operand _7 == 0, type of def: internal
t.c:7:29: missed:   mixed mask and nonmask vector types

[Bug tree-optimization/109502] [12/13 Regression] wrong code with -O -ftree-vectorize -fvect-cost-model=unlimited on aarch64
