> On December 8, 2017 4:51:16 PM GMT+01:00, Jan Hubicka <hubi...@ucw.cz> wrote:
> >>
> >> This restores the vec_construct cost dependence on the vector element
> >> count.  Honza removed this (accidentally?) during the rework.
> >>
> >> Bootstrap / regtest running on x86_64-unknown-linux-gnu, ok for
> >trunk?
> >
> >Hmm, the false parameter to ix86_vec_cost is supposed to do that.  It
> >uses:
> >
> >if (!parallel)
> >
> >return cost * GET_MODE_NUNITS (mode);
> >
> >
> >Why doesn't it work?
>
> I see.  Not exactly the same.  I guess we need to see why the costs favor a 16
> element vector init plus store over 16 scalar stores then...

Isn't that what you proposed as profitable for Martin Jambor's copy-by-pieces
issue?

I think the reason is that stores are modelled as having a latency (or cost/2)
of 4.  Construction is modelled as a simple SSE op (latency 1) * number of
parts.  So in this simplified model we accumulate the latency of 16 stores as
16*4, while construction plus one store comes to 16*1+4.

Honza
>
> See the PR.
>
> >
> >Honza
> >>
> >> Thanks,
> >> Richard.
> >>
> >> 2017-12-08  Richard Biener  <rguent...@suse.de>
> >>
> >> 	PR target/83008
> >> 	* config/i386/i386.c (ix86_builtin_vectorization_cost): Restore
> >> 	vec_construct dependence on vector element count.
> >>
> >> Index: gcc/config/i386/i386.c
> >> ===================================================================
> >> --- gcc/config/i386/i386.c	(revision 255499)
> >> +++ gcc/config/i386/i386.c	(working copy)
> >> @@ -44879,7 +44879,8 @@ ix86_builtin_vectorization_cost (enum ve
> >>  			      ix86_cost->sse_op, true);
> >>
> >>        case vec_construct:
> >> -	return ix86_vec_cost (mode, ix86_cost->sse_op, false);
> >> +	return (ix86_vec_cost (mode, ix86_cost->sse_op, false)
> >> +		* (TYPE_VECTOR_SUBPARTS (vectype) - 1));
> >>
> >>        default:
> >>  	gcc_unreachable ();
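
For reference, the simplified latency model from the reply above (16 scalar
stores versus one vector construction plus one store) can be written out as a
small sketch.  The constants are just the figures quoted in the message, and
the helper names are hypothetical; none of this is GCC's actual cost code.

```c
/* Illustrative figures from the message, not GCC's real cost tables.  */
enum
{
  STORE_LATENCY = 4,	/* a store is modelled with latency 4 */
  SSE_OP_LATENCY = 1	/* a simple SSE op is modelled with latency 1 */
};

/* Hypothetical helper: NELTS scalar stores, with latencies simply summed.  */
static int
scalar_stores_cost (int nelts)
{
  return nelts * STORE_LATENCY;
}

/* Hypothetical helper: one SSE op per element to build the vector,
   followed by a single vector store.  */
static int
construct_and_store_cost (int nelts)
{
  return nelts * SSE_OP_LATENCY + STORE_LATENCY;
}
```

With 16 elements the first helper gives 64 and the second gives 20, which is
why this model ends up preferring vector construction plus a single store.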