[Bug target/106022] [12/13 Regression] Enable vectorizer generates extra load

hjl.tools at gmail dot com via Gcc-bugs Fri, 24 Jun 2022 14:22:33 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106022


--- Comment #12 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Richard Biener from comment #11)
> (In reply to H.J. Lu from comment #9)
> > (In reply to Richard Biener from comment #8)
> > > (In reply to H.J. Lu from comment #6)
> > > > Created attachment 53169 [details]
> > > > A patch
> > > > 
> > > > This patch multiplies the vector store cost by the number of scalar
> > > > elements in a word to properly compare scalar store cost against
> > > > vector store cost.
> > > 
> > > But that's not "properly" but "wrong" ...
> > > 
> > > Note we already cost the vector load from the constant pool so the vector
> > > side costing is correct.
> > > 
> > > What's eventually imprecise is the scalar cost where you could anticipate
> > > store merging, but adjusting the vector cost side is just wrong.
> > 
> > I tried to adjust the scalar cost.  When the scalar cost of storing a byte
> > is 6, dividing it by 8 (the number of scalar elements in a word) becomes 0.
> > Will it work?
> 
> No, I think you would need to pattern match an actual store sequence,
> for example by looking at
> 
>  if (STMT_VINFO_GROUPED_ACCESS (stmt_info)
>      && pow2p_hwi (DR_GROUP_STORE_COUNT (stmt_info)))
>    /* cost a possibly merged store only once (but with larger mode?) */
>    if (DR_GROUP_FIRST_ELEMENT (stmt_info) == stmt_info)
>      ...

That information isn't available in add_stmt_cost.  I will
count the number of scalar stores and vector stores.  Then I will
compare them in finish_cost.
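
A minimal sketch of that counting approach, not the actual patch.  Only the
names add_stmt_cost and finish_cost come from this discussion; toy_costs,
stmt_kind, the simplified signatures and the comparison rule are hypothetical
stand-ins chosen for illustration.

```cpp
// Hypothetical statement classification; not GCC's real enum.
enum class stmt_kind { scalar_store, vector_store, other };

// Hypothetical stand-in for a per-region cost object; not GCC's real class.
struct toy_costs
{
  unsigned scalar_stores = 0;
  unsigned vector_stores = 0;
  unsigned total_cost = 0;

  // Called once per costed statement: accumulate the cost and remember
  // how many stores of each kind were seen.
  unsigned add_stmt_cost (unsigned count, stmt_kind kind, unsigned unit_cost)
  {
    if (kind == stmt_kind::scalar_store)
      scalar_stores += count;
    else if (kind == stmt_kind::vector_store)
      vector_stores += count;
    unsigned cost = count * unit_cost;
    total_cost += cost;
    return cost;
  }

  // Called once after all statements were costed.  The rule below is an
  // assumption: it only reports whether vectorizing reduces the store count.
  bool finish_cost () const
  {
    return vector_stores < scalar_stores;
  }
};
```

The point of deferring the comparison is that a single add_stmt_cost call
sees one statement at a time, while the totals are only known at the end.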

> So costing the whole sequence of scalar stores a single time, with
> adjusted mode.
> 
> store-merging also handles non-QImode stores btw.
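
To illustrate the two scalar-side options discussed above: dividing a byte
store cost of 6 by 8 truncates to 0 in integer arithmetic, whereas charging a
power-of-two group once on its first element keeps a non-zero cost.  This is
only a sketch; per_element_cost, group_store_cost and their parameters are
hypothetical, and charging the merged store the same cost as one scalar store
is an assumption.

```cpp
// Hypothetical helper, not a GCC function: true for 1, 2, 4, 8, ...
constexpr bool is_pow2 (unsigned n)
{
  return n != 0 && (n & (n - 1)) == 0;
}

// Naive option: divide the per-element store cost by the number of
// elements in a word.  Integer division truncates toward zero.
constexpr unsigned per_element_cost (unsigned store_cost,
                                     unsigned elems_per_word)
{
  return store_cost / elems_per_word;
}

// Alternative option: cost a possibly merged group a single time, on its
// first element; the other elements of the group contribute nothing.
constexpr unsigned group_store_cost (unsigned store_cost,
                                     unsigned group_size, bool is_first)
{
  if (!is_pow2 (group_size))
    return store_cost;          // no merge anticipated, every store pays
  return is_first ? store_cost : 0;
}
```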

[Bug target/106022] [12/13 Regression] Enable vectorizer generates extra load
