On Sun, Jun 10, 2012 at 5:58 PM, William J. Schmidt
<wschm...@linux.vnet.ibm.com> wrote:
> The fix for PR53331 caused a degradation to 187.facerec on
> powerpc64-unknown-linux-gnu.  The following simple patch reverses the
> degradation without otherwise affecting SPEC cpu2000 or cpu2006.
> Bootstrapped and regtested on that platform with no new regressions.  Ok
> for trunk?

Well, would the real cost not be subparts * scalar_to_vec plus
subparts * vec_perm?  At least vec_perm isn't the cost for building up a
vector from N scalar elements either (it might be close enough if
subparts == 2).  What's the case with facerec here?  Does it have
subparts == 2?  I really wanted to pessimize this case for, say, AVX and
char elements, thus building up a vector from 32 scalars, which certainly
does not cost a mere vec_perm.  So, maybe special-case the subparts == 2
case and assume vec_perm would match the cost only in that case.
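
To make the suggestion concrete, here is a rough standalone sketch of the
special-casing, not a patch against tree-vect-stmts.c.  The function name
strided_load_cost and the unit cost constants are made up for
illustration; the real numbers would come from vect_get_stmt_cost and the
target's cost hooks.

```c
#include <assert.h>

/* Hypothetical unit costs, for illustration only.  In GCC the real
   values come from the target via vect_get_stmt_cost.  */
enum toy_cost { SCALAR_LOAD = 1, SCALAR_TO_VEC = 1, VEC_PERM = 1 };

/* Toy model of the strided-load cost discussed above.
   strided_load_cost is an assumed name, not a GCC function.  */
static int
strided_load_cost (int ncopies, int subparts)
{
  assert (ncopies > 0 && subparts > 0);

  /* N scalar loads for each vector copy.  */
  int cost = SCALAR_LOAD * ncopies * subparts;

  if (subparts == 2)
    /* Two elements: assume a single permute is close enough.  */
    cost += ncopies * VEC_PERM;
  else
    /* Otherwise charge one scalar_to_vec and one vec_perm per element.  */
    cost += ncopies * subparts * (SCALAR_TO_VEC + VEC_PERM);

  return cost;
}
```

With these unit costs, strided_load_cost (1, 2) gives 3, while
strided_load_cost (1, 32) gives 96, so the 32-element char case is
pessimized heavily and the two-element case keeps the vec_perm cost.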

Thanks,
Richard.

> Thanks,
> Bill
>
>
> 2012-06-10  Bill Schmidt  <wschm...@linux.ibm.com>
>
>        * tree-vect-stmts.c (vect_model_load_cost):  Change cost model
>        for strided loads.
>
>
> Index: gcc/tree-vect-stmts.c
> ===================================================================
> --- gcc/tree-vect-stmts.c       (revision 188341)
> +++ gcc/tree-vect-stmts.c       (working copy)
> @@ -1031,11 +1031,10 @@ vect_model_load_cost (stmt_vec_info stmt_info, int
>   /* The loads themselves.  */
>   if (STMT_VINFO_STRIDE_LOAD_P (stmt_info))
>     {
> -      /* N scalar loads plus gathering them into a vector.
> -         ???  scalar_to_vec isn't the cost for that.  */
> +      /* N scalar loads plus gathering them into a vector.  */
>       inside_cost += (vect_get_stmt_cost (scalar_load) * ncopies
>                      * TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info)));
> -      inside_cost += ncopies * vect_get_stmt_cost (scalar_to_vec);
> +      inside_cost += ncopies * vect_get_stmt_cost (vec_perm);
>     }
>   else
>     vect_get_load_cost (first_dr, ncopies,
>
>
