Re: [PATCH] i386: Fix up _mm_min_ss etc. handling of zeros and NaNs [PR116738]

Richard Biener Fri, 20 Sep 2024 00:08:43 -0700

On Fri, Sep 20, 2024 at 8:32 AM Jakub Jelinek <ja...@redhat.com> wrote:
>
> On Fri, Sep 20, 2024 at 08:01:58AM +0200, Richard Biener wrote:
> > > P.S. I have a patch to replace UNSPEC_IEEE_M{AX,IN} with IF_THEN_ELSE
> > > (except for the 3dNOW! PFMIN/MAX, those actually are documented to behave
> > > differently), but it actually doesn't improve anything much, as neither
> > > simplify_const_relational_operation nor simplify_ternary_operation is
> > > able to fold comparisons with two CONST_VECTOR operands or IF_THEN_ELSE
> > > with 3 CONST_VECTOR operands.
> > > So, maybe a better approach will be to generically fold the builtins with
> > > constant arguments (maybe leaving NaNs to runtime).
> >
> > It would be possible to fold them in the gimple folding hook to
> > VEC_COND_EXPRs, with the chance of the min/max operation being lost when
> > expanding to RTL.
>
> Sure, but we don't actually pattern recognize
> typedef float v4sf __attribute__((vector_size (4 * sizeof (float))));
>
> v4sf
> foo (v4sf x, v4sf y)
> {
>   return x < y ? y : x;
> }
> back to maxps etc.


Interesting - we did with GCC 13, but it seems to have regressed with GCC 14
here, even when using -msse4.2.  But yeah, this is done at RTL expansion and is
quite fragile, and later the combine patterns are not good enough, it seems,
even with UNSPEC_BLENDV ...

>  So it wouldn't be an optimization in most cases, at
> least until we do that, user was looking for such insn or better with
> _mm_max_ps...
> Maybe we should.
>
> For scalar ('-Dvector_size(x)=') this is currently matched in ce2.
>
> Exception-wise, the insn seems to raise Invalid on a NaN input (either
> operand) and, if y is an SNaN, actually propagates it rather than turning it
> into a QNaN, so I think it is actually an exact match for x < y ? y : x
> (or x > y ? y : x).

Yes, it is documented in the Intel ISA manual that way.
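
For illustration, here is a minimal scalar sketch of that expression.  The
helper names max_like and is_negative_zero are made up for this example and
are not part of the patch.  It only shows which operand the plain C expression
returns when the comparison is false (a NaN in either operand, or +0.0 versus
-0.0); it says nothing about exception flags or about which insn the compiler
picks.

```c
#include <math.h>

/* Hypothetical helper: the expression discussed above, in scalar form.
   When x < y is false (either operand is a NaN, or the operands compare
   equal, as +0.0 and -0.0 do), x is returned unchanged.  */
static float
max_like (float x, float y)
{
  return x < y ? y : x;
}

/* Hypothetical helper: nonzero if V is a zero with the sign bit set.  */
static int
is_negative_zero (float v)
{
  return v == 0.0f && signbit (v);
}
```

With these, max_like (NAN, 1.0f) is a NaN while max_like (1.0f, NAN) is 1.0f,
and max_like (-0.0f, 0.0f) keeps the negative zero while
max_like (0.0f, -0.0f) keeps the positive one, so the result depends on
operand order in exactly the cases where the comparison is false.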

Richard.

>         Jakub
>
