On Sun, Jun 10, 2012 at 5:58 PM, William J. Schmidt
<wschm...@linux.vnet.ibm.com> wrote:
> The fix for PR53331 caused a degradation to 187.facerec on
> powerpc64-unknown-linux-gnu.  The following simple patch reverses the
> degradation without otherwise affecting SPEC cpu2000 or cpu2006.
> Bootstrapped and regtested on that platform with no new regressions.  Ok
> for trunk?

Well, would the real cost not be subparts * scalar_to_vec plus
subparts * vec_perm?  At least vec_perm isn't the cost for building up
a vector from N scalar elements either (it might be close enough if
subparts == 2).  What's the case with facerec here?  Does it have
subparts == 2?  I really wanted to pessimize this case for say AVX and
char elements, thus building up a vector from 32 scalars which
certainly does not cost a mere vec_perm.  So, maybe special-case the
subparts == 2 case and assume vec_perm would match the cost only in
that case.

Thanks,
Richard.

> Thanks,
> Bill
>
>
> 2012-06-10  Bill Schmidt  <wschm...@linux.ibm.com>
>
>         * tree-vect-stmts.c (vect_model_load_cost): Change cost model
>         for strided loads.
>
>
> Index: gcc/tree-vect-stmts.c
> ===================================================================
> --- gcc/tree-vect-stmts.c       (revision 188341)
> +++ gcc/tree-vect-stmts.c       (working copy)
> @@ -1031,11 +1031,10 @@ vect_model_load_cost (stmt_vec_info stmt_info, int
>    /* The loads themselves.  */
>    if (STMT_VINFO_STRIDE_LOAD_P (stmt_info))
>      {
> -      /* N scalar loads plus gathering them into a vector.
> -         ??? scalar_to_vec isn't the cost for that.  */
> +      /* N scalar loads plus gathering them into a vector.  */
>        inside_cost += (vect_get_stmt_cost (scalar_load) * ncopies
>                       * TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info)));
> -      inside_cost += ncopies * vect_get_stmt_cost (scalar_to_vec);
> +      inside_cost += ncopies * vect_get_stmt_cost (vec_perm);
>      }
>    else
>      vect_get_load_cost (first_dr, ncopies,
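
To make the subparts == 2 suggestion above concrete, here is a rough
standalone sketch.  It is not GCC code: struct stmt_costs and
strided_load_inside_cost are hypothetical names standing in for
vect_get_stmt_cost and the strided-load branch of vect_model_load_cost,
and the cost used when subparts != 2 is only an assumption taken from
the "subparts * scalar_to_vec plus subparts * vec_perm" question, not
an agreed model.

```c
/* Hypothetical stand-ins for vect_get_stmt_cost (kind); the real values
   come from the target's vectorizer cost hooks.  */
struct stmt_costs
{
  int scalar_load;
  int scalar_to_vec;
  int vec_perm;
};

/* Sketch of the inside-loop cost of a strided load: N scalar loads plus
   the cost of gathering them into a vector.  */
static int
strided_load_inside_cost (const struct stmt_costs *c, int ncopies,
                          int subparts)
{
  /* The scalar loads themselves, as in the existing code.  */
  int cost = c->scalar_load * ncopies * subparts;

  if (subparts == 2)
    /* Two elements: assume a single vec_perm is close enough.  */
    cost += ncopies * c->vec_perm;
  else
    /* Assumption: wider vectors pay per element to build the vector.  */
    cost += ncopies * subparts * (c->scalar_to_vec + c->vec_perm);

  return cost;
}
```

With every unit cost set to 1 and ncopies == 1, a 32-element char
vector comes out at 96 under this sketch, against 33 if a single
vec_perm were charged regardless of subparts, while the two-element
case stays at 3 either way.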