------- Comment #5 from ubizjak at gmail dot com  2008-08-04 11:43 -------
Hm, the following testcase is not vectorized because of the vectorizer
cost model (-O2 -msse3 -ftree-vectorize -ffast-math) on an i686 target:

--cut here--
void testf(void)
{
  int i;

  for (i = 0; i < 16; i++)
    cf[i] = af[i] + bf[i];
}
--cut here--


Compilation reports:

pr30211.c:8: note: vectorization_factor = 2, niters = 16
pr30211.c:8: note: === vect_update_slp_costs_according_to_vf ===
pr30211.c:8: note: cost model: vector iteration cost = 16 is divisible by
scalar iteration cost = 8 by a factor greater than or equal to the
vectorization factor = 2 .
pr30211.c:8: note: not vectorized: vectorization not profitable.
pr30211.c:8: note: not vectorized: vector version will never be profitable.
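
The cost model note above boils down to one inequality: a vector iteration
replaces VF scalar iterations, so the vector body can only win if it costs
less than VF times the scalar iteration cost. With the reported numbers,
16 >= 2 * 8, so the loop is rejected. A minimal sketch of that comparison,
assuming only the values from the dump; the function names are hypothetical
and this is not GCC's actual code, whose profitability analysis considers
more than this single check:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not a GCC function.  One vector iteration does the
   work of `vf` scalar iterations, so vectorization can never pay off when
   vec_iter_cost >= vf * scalar_iter_cost.  */
static bool
vector_never_profitable (int vec_iter_cost, int scalar_iter_cost, int vf)
{
  return vec_iter_cost >= vf * scalar_iter_cost;
}

/* Self-check with the numbers from the dump: cost 16 vs. 8, VF = 2.  */
static void
check_reported_case (void)
{
  assert (vector_never_profitable (16, 8, 2));
}
```

Under this reading, any vector iteration cost below 16 would have let the
loop through for the same scalar cost and VF.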

However, with the cost model disabled, the loop in this testcase compiles to:

.L2:
        movaps  bf(%eax), %xmm0
        addps   af(%eax), %xmm0
        movaps  %xmm0, cf(%eax)
        addl    $16, %eax
        cmpl    $128, %eax
        jne     .L2

which is IMO faster than the equivalent scalar version:

.L2:
        movss   bf+4(,%eax,8), %xmm1
        addss   af+4(,%eax,8), %xmm1
        movss   bf(,%eax,8), %xmm0
        addss   af(,%eax,8), %xmm0
        movss   %xmm0, cf(,%eax,8)
        movss   %xmm1, cf+4(,%eax,8)
        addl    $1, %eax
        cmpl    $16, %eax
        jne     .L2
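
As a rough illustration of the difference, one can count dynamic
instructions from the two listings: the vector loop steps %eax by 16 up to
128 (8 iterations of 6 instructions), while the scalar loop steps by 1 up
to 16 (16 iterations of 9 instructions). Instruction counts are only a
proxy for speed, not a measurement, and the helper names below are
hypothetical:

```c
#include <assert.h>

/* Hypothetical helper: dynamic instruction count of a counted loop whose
   index starts at 0 and advances by `step` until it equals `limit`.  */
static int
dynamic_insns (int limit, int step, int insns_per_iter)
{
  return (limit / step) * insns_per_iter;
}

/* Self-check with the loop bounds and body sizes read off the listings.  */
static void
check_listings (void)
{
  assert (dynamic_insns (128, 16, 6) == 48);   /* vector loop */
  assert (dynamic_insns (16, 1, 9) == 144);    /* scalar loop */
}
```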


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35252
