https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92651

--- Comment #6 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 26 Nov 2019, wwwhhhyyy333 at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92651
> 
> --- Comment #4 from Hongyu Wang <wwwhhhyyy333 at gmail dot com> ---
> (In reply to Richard Biener from comment #2)
> > Btw, which variant is actually the fastest for you?  abs expansion doesn't
> > do any cost comparison but just uses direct abs, max and then the xor with
> > shift as third option (and after that falls back to compare & jump, which
> > might later be if-converted into cmov).
> 
> Actually the xor with shift could be the fastest: it improves 525.x264_r
> by about 8% compared to the pmaxsd one, and with cmove the improvement
> is 6.5%.

I see.  So I wonder if it makes sense to add some costing checks to
abs expansion... - the simplest way is probably to give the x86 backend
abs patterns and let it drive the expansion itself here.
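
For reference, the non-direct expansion variants discussed above can be
sketched at the C level roughly as follows.  This is only an illustration
of the arithmetic, not the RTL that expand emits; the function names are
made up for this sketch, and it assumes 32-bit int, arithmetic right shift
of negative values and modulo conversion from unsigned to signed (all of
which GCC provides on x86).

```c
#include <stdint.h>

/* Hypothetical helper names, for illustration only.  */

/* max (x, -x): the form that ends up as pmaxsd once STV converts the
   integer max to a vector instruction.  The negation is done in
   unsigned arithmetic so INT32_MIN wraps instead of being undefined.  */
static int32_t abs_via_max (int32_t x)
{
  int32_t neg = (int32_t) (0u - (uint32_t) x);
  return x > neg ? x : neg;
}

/* xor with shift: mask is all ones for negative x and zero otherwise,
   so (x ^ mask) - mask conditionally negates x without a branch.
   Assumes >> on a negative int32_t is an arithmetic shift.  */
static int32_t abs_via_xor_shift (int32_t x)
{
  int32_t mask = x >> 31;
  return (int32_t) (((uint32_t) x ^ (uint32_t) mask) - (uint32_t) mask);
}

/* compare & jump: the fallback, which later passes may if-convert
   into a cmov.  */
static int32_t abs_via_compare (int32_t x)
{
  return x < 0 ? (int32_t) (0u - (uint32_t) x) : x;
}
```

All three return the same value for every input (including INT32_MIN,
which maps to itself), so the choice between them is purely a matter of
cost on the target.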

> I don't think this conversion should happen on every cmove instruction,
> regardless of how many SSE registers it would use. I think the simplest way to
> avoid this is adjusting the cost.

Well, for STV the issue is that "costing" is done on individual
chains.  Note that STV doesn't transform cmovs, it transforms min/max
instructions which exist on integer modes just for the sake of STV ...

STV (like many other combine-like transforms) doesn't consider the
global picture (multiple min/max chains in the same code region, etc.)
but only works locally.  So any costing wrench you throw in has
an effect on _all_ chains.

Clearly abs expansion had a successful non-cmov path before the STV
changes and the intention was not to make min/max the new abs expansion
of choice.  So I guess we need to rectify that - and the easiest and
least intrusive way (for other targets) is to add abs expansion
patterns.
