https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92651
--- Comment #6 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 26 Nov 2019, wwwhhhyyy333 at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92651
>
> --- Comment #4 from Hongyu Wang <wwwhhhyyy333 at gmail dot com> ---
> (In reply to Richard Biener from comment #2)
> > Btw, which variant is actually the fastest for you?  abs expansion doesn't
> > do any cost comparison but just uses direct abs, max and then the xor with
> > shift as third option (and after that fall back to compare & jump which
> > later might be if-converted into cmov).
>
> Actually the xor with shift could be the fastest, which improves
> about 8% on 525.x264_r compared to the pmaxsd one, and with cmove the
> improvement is 6.5%.

I see.  So I wonder if it makes sense to add some costing checks to
abs expansion... - the simplest way is probably to make the x86
backends have abs patterns and drive expansion themselves here.

> I don't think this conversion should happen on every cmove instruction,
> regardless of how many SSE registers it would use. I think the simplest way to
> avoid this is adjusting the cost.

Well, for STV the issue is that "costing" is done on individual chains.
Note that STV doesn't transform cmovs, it transforms min/max instructions
which exist on integer modes just for the sake of STV ...

STV (like many other combine-like transforms) doesn't consider the
global picture (multiple min/max chains in the same code region, etc.)
but only works locally.  So any costing wrenches you throw in have
an effect on _all_ chains.

Clearly abs expansion had a successful non-cmov path before the STV
changes and the intention was not to make min/max the new abs
expansion of choice.  So I guess we need to rectify that - and the
easiest and least intrusive way (for other targets) is to add
abs expansion patterns.
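
For reference, the abs expansion fallbacks discussed in this thread (max, xor with shift, and compare & jump) can be sketched at the C source level roughly as below. This is only an illustration of the arithmetic, not GCC's expander code: the function names are hypothetical, the xor/shift variant assumes that >> on a negative int is an arithmetic shift (which GCC documents, but ISO C leaves implementation-defined), and INT_MIN is ignored because negating it overflows.

```c
#include <limits.h>

/* Hypothetical names; each function mirrors one expansion strategy. */

/* max form: abs(x) == max(x, -x).  This is the shape that can end up
   as an integer max instruction (and, after STV, as pmaxsd). */
static int abs_via_max(int x)
{
    int neg = -x;
    return x > neg ? x : neg;
}

/* xor-with-shift form: mask is 0 for x >= 0 and -1 for x < 0
   (assuming an arithmetic right shift), so (x ^ mask) - mask
   yields x or ~x + 1 == -x respectively. */
static int abs_via_xor_shift(int x)
{
    int mask = x >> (sizeof(int) * CHAR_BIT - 1);
    return (x ^ mask) - mask;
}

/* compare & jump form: the last-resort expansion, which a later pass
   may if-convert into a cmov. */
static int abs_via_branch(int x)
{
    if (x < 0)
        return -x;
    return x;
}
```

All three return the same value for any input other than INT_MIN; they differ only in which instruction sequence the compiler is nudged towards, which is what the costing question above is about.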