https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96481

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
One interesting peculiarity of this testcase is that the ifs only select
between two scalar values, so overall there is no control-flow effect.  That
is, the issue of splitting the dataref groups, which we currently do at BB
granularity, could be solved by not splitting groups when we are sure all
members of the group are either executed or not executed (which is the real
intent of this splitting).

In fact the current dataref_group computation seems to be useful only for
making sure to split groups _inside_ BBs at suitable points, while the
cross-BB split is ensured by vect_analyze_data_ref_accesses.  To relax the
latter we'd need to enhance the dataref_group computation to be
conservative for cross-BB groups (and, for outer-loop vectorization,
compute it there as well).  The simplest correctness fix is to ensure the
group_id is bumped when going from one BB to the next.
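
As a rough illustration of that last point, here is a toy model of the
group_id assignment.  It is a sketch only and not GCC code: ToyDataRef,
its fields and assign_group_ids are hypothetical names invented for this
example, and "after_barrier" merely stands in for whatever already forces
a split inside a BB today.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a data reference; NOT GCC's data_reference.
struct ToyDataRef {
    int bb_index;        // basic block containing the access
    bool after_barrier;  // assumed flag: a split point precedes this access
};

// Walk the references in program order and hand out group ids.  The id is
// bumped at in-BB split points and, conservatively, whenever the walk moves
// to a different basic block.
std::vector<unsigned> assign_group_ids(const std::vector<ToyDataRef>& refs) {
    std::vector<unsigned> ids;
    ids.reserve(refs.size());
    unsigned group_id = 0;
    for (std::size_t i = 0; i < refs.size(); ++i) {
        if (i > 0) {
            bool crossed_bb = refs[i].bb_index != refs[i - 1].bb_index;
            if (crossed_bb || refs[i].after_barrier)
                ++group_id;  // at most one bump per reference
        }
        ids.push_back(group_id);
    }
    assert(ids.size() == refs.size());
    return ids;
}
```

In this model two references can only share a group when they sit in the
same BB with no split point between them, which is the conservative
behaviour described above.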

That would help this case up to encountering the PHIs/ifs which
we only vectorize when they are in the same BB.

inline unsigned opt(unsigned a, unsigned b, unsigned c, unsigned d) {
    return a > b ? c : d;
}

void opt( unsigned * __restrict dst, const unsigned *pa, const unsigned *pb,
        const unsigned *pc, const unsigned  *pd )
{
  unsigned tem = opt(*pa++, *pb++, *pc++, *pd++);
  unsigned tem1 = opt(*pa++, *pb++, *pc++, *pd++);
  unsigned tem2 = opt(*pa++, *pb++, *pc++, *pd++);
  unsigned tem3 = opt(*pa++, *pb++, *pc++, *pd++);
  *dst++ = tem;
  *dst++ = tem1;
  *dst++ = tem2;
  *dst++ = tem3;
}

ends up with

  _35 = {iftmp.24_22, iftmp.24_23, iftmp.24_24, iftmp.24_25};
  vectp.30_34 = dst_26(D);
  MEM <vector(4) unsigned int> [(unsigned int *)vectp.30_34] = _35;

SLP discovery stops at the PHIs, which are spread out (and the loads are
still spread out as well).
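
For comparison, the computation has no real control-flow effect and can be
written lane-wise as one compare followed by one blend.  The following is
only a hand-written sketch of that shape, not what GCC emits; select4 and
its mask array are names made up for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Lane-wise form of "a > b ? c : d" over four elements: the first loop
// plays the role of a vector compare, the second of a vector blend.
void select4(unsigned* dst, const unsigned* pa, const unsigned* pb,
             const unsigned* pc, const unsigned* pd) {
    assert(dst && pa && pb && pc && pd);
    std::array<unsigned, 4> mask{};
    for (std::size_t i = 0; i < 4; ++i)
        mask[i] = pa[i] > pb[i] ? ~0u : 0u;               // all-ones or zero
    for (std::size_t i = 0; i < 4; ++i)
        dst[i] = (pc[i] & mask[i]) | (pd[i] & ~mask[i]);  // pick c or d
}
```

Each lane reads pa[i], pb[i], pc[i] and pd[i] and writes dst[i], so all
four loads and the store are contiguous groups of four.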
