On Wed, Aug 7, 2019 at 2:20 PM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Wed, Aug 7, 2019 at 1:51 PM Richard Biener <rguent...@suse.de> wrote:
> >
> > On Wed, 7 Aug 2019, Richard Biener wrote:
> >
> > > On Mon, 5 Aug 2019, Uros Bizjak wrote:
> > >
> > > > On Mon, Aug 5, 2019 at 3:29 PM Richard Biener <rguent...@suse.de> wrote:
> > > >
> > > > > > > > > > > (define_mode_iterator MAXMIN_IMODE [SI "TARGET_SSE4_1"] 
> > > > > > > > > > > [DI "TARGET_AVX512F"])
> > > > > > > > > > >
> > > > > > > > > > > and then we need to split DImode for 32bits, too.
> > > > > > > > > >
> > > > > > > > > > For now, please add "TARGET_64BIT && TARGET_AVX512F" for 
> > > > > > > > > > DImode
> > > > > > > > > > condition, I'll provide _doubleword splitter later.
> > > > > > > > >
> > > > > > > > > Shouldn't that be TARGET_AVX512VL instead?  Or does the insn 
> > > > > > > > > use %g0 etc.
> > > > > > > > > to force use of %zmmN?
> > > > > > > >
> > > > > > > > It generates V4SI mode, so - yes, AVX512VL.
> > > > > > >
> > > > > > >     case SMAX:
> > > > > > >     case SMIN:
> > > > > > >     case UMAX:
> > > > > > >     case UMIN:
> > > > > > >       if ((mode == DImode && (!TARGET_64BIT || !TARGET_AVX512VL))
> > > > > > >           || (mode == SImode && !TARGET_SSE4_1))
> > > > > > >         return false;
> > > > > > >
> > > > > > > so there's no way to use AVX512VL for 32bit?
> > > > > >
> > > > > > There is a way, but on 32bit targets, we need to split DImode
> > > > > > operation to a sequence of SImode operations for unconverted 
> > > > > > pattern.
> > > > > > This is of course doable, but somehow more complex than simply
> > > > > > emitting a DImode compare + DImode cmove, which is what current
> > > > > > splitter does. So, a follow-up task.
> > > > >
> > > > > Ah, OK.  So for the above condition we can elide the !TARGET_64BIT
> > > > > check we just need to properly split if we enable the scalar minmax
> > > > > pattern for DImode on 32bits, the STV conversion would go fine.
> > > >
> > > > Yes, that is correct.
> > >
> > > So I tested the patch below (now with appropriate ChangeLog) on
> > > x86_64-unknown-linux-gnu.  I've thrown it at SPEC CPU 2006 with
> > > the obvious hmmer improvement, now checking for off-noise results
> > > with a 3-run on those that may have one (with more than +-1 second
> > > differences in the 1-run).
> > >
> > > As-is the patch likely runs into the splitting issue for DImode
> > > on i?86 and the patch misses functional testcases.  I'll do the
> > > hmmer loop with both DImode and SImode and testcases to trigger
> > > all pattern variants with the different ISAs we have.
> > >
> > > Some of the patch could be split out (the cost changes that are
> > > also effective for DImode for example).
> > >
> > > AFAICS we could go with only adding SImode avoiding the DImode
> > > splitting thing and this would solve the hmmer regression.
> >
> > I've additionally bootstrapped with --with-arch=nehalem which
> > reveals
> >
> > FAIL: gcc.target/i386/minmax-2.c scan-assembler test
> > FAIL: gcc.target/i386/minmax-2.c scan-assembler-not cmp
> >
> > we emit cmp + cmov here now with -msse4.1 (as soon as the max
> > pattern is enabled I guess)
>
> Actually, we have to split using ix86_expand_int_compare. This will
> generate optimized CC mode.

So, this only matters for comparisons against zero. Currently, the
insn_and_split pattern allows only registers, but we can add other
types, too. I'd say that this is benign issue.

Uros.

Reply via email to