On Sun, 10 Jul 2016, Uros Bizjak wrote:

> On Wed, Jul 6, 2016 at 3:18 PM, Richard Biener <rguent...@suse.de> wrote:
> 
> >> > 2016-07-04  Richard Biener  <rguent...@suse.de>
> >> >
> >> >     PR rtl-optimization/68961
> >> >     * fwprop.c (propagate_rtx): Allow SUBREGs of VEC_CONCAT and CONCAT
> >> >     to simplify to a non-constant.
> >> >
> >> >     * gcc.target/i386/pr68961.c: New testcase.
> >>
> >> Thanks, LGTM.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, it causes
> >
> > FAIL: gcc.target/i386/sse2-load-multi.c scan-assembler-times movup 2
> >
> > as the peephole created for that testcase no longer applies as fwprop
> > does
> >
> > In insn 10, replacing
> >  (vec_concat:V2DF (vec_select:DF (reg:V2DF 91)
> >             (parallel [
> >                     (const_int 0 [0])
> >                 ]))
> >         (mem:DF (reg/f:DI 95) [0  S8 A128]))
> >  with (vec_concat:V2DF (reg:DF 93 [ MEM[(const double *)&a + 8B] ])
> >         (mem:DF (reg/f:DI 95) [0  S8 A128]))
> > Changed insn 10
> >
> > resulting in
> >
> >         movsd   a+8(%rip), %xmm0
> >         movhpd  a+16(%rip), %xmm0
> >
> > again rather than movupd.
> >
> > Uros, there is probably a missing peephole for the new form - can you
> > fix this as a followup or should I hold on this patch for a bit longer?
> 
> No, please proceed with the patch, I'll fix this fallout with a
> followup patch in a couple of days.

Applied as r238238.  Is the following x86 change ok then which
adjusts the vectorizer vector construction cost to sth more sensible?
I have adjusted the generic implementation in targhooks.c this way
a few weeks ago already.

Thanks,
Richard.

2016-07-12  Richard Biener  <rguent...@suse.de>

        * targhooks.c (default_builtin_vectorization_cost): Adjust
        vec_construct cost.
        * config/i386/i386.c (ix86_builtin_vectorization_cost): Likewise.

Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c      (revision 237196)
+++ gcc/config/i386/i386.c      (working copy)
@@ -49503,8 +49520,6 @@ static int
 ix86_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
                                  tree vectype, int)
 {
-  unsigned elements;
-
   switch (type_of_cost)
     {
       case scalar_stmt:
@@ -49546,8 +49561,7 @@ ix86_builtin_vectorization_cost (enum ve
         return ix86_cost->vec_stmt_cost;
 
       case vec_construct:
-       elements = TYPE_VECTOR_SUBPARTS (vectype);
-       return ix86_cost->vec_stmt_cost * (elements / 2 + 1);
+       return ix86_cost->vec_stmt_cost * (TYPE_VECTOR_SUBPARTS (vectype) - 1);
 
       default:
         gcc_unreachable ();

Reply via email to