On Sat, Jun 1, 2019 at 6:08 AM Jeff Law <l...@redhat.com> wrote:
>
> On 5/30/19 2:53 AM, Hongtao Liu wrote:
> > On Thu, May 30, 2019 at 3:23 AM Jeff Law <l...@redhat.com> wrote:
> >> On 5/9/19 10:54 PM, Hongtao Liu wrote:
> >>> On Fri, May 10, 2019 at 3:55 AM Jeff Law <l...@redhat.com> wrote:
> >>>> On 5/6/19 11:38 PM, Hongtao Liu wrote:
> >>>>> Hi Uros and GCC:
> >>>>>   This patch is to fix ix86_expand_sse_comi_round whose implementation
> >>>>> was not correct.
> >>>>>   New implentation aligns with _mm_cmp_round_s[sd]_mask.
> >>>>>
> >>>>> Bootstrap and regression tests for x86 is fine.
> >>>>> Ok for trunk?
> >>>>>
> >>>>>
> >>>>> ChangeLog:
> >>>>> gcc/
> >>>>>    * config/i386/i386-expand.c (ix86_expand_sse_comi_round):
> >>>>>    Modified, original implementation isn't correct.
> >>>>>
> >>>>> gcc/testsuite
> >>>>>    * gcc.target/i386/avx512f-vcomisd-2.c: New runtime tests.
> >>>>>    * gcc.target/i386/avx512f-vcomisd-2.c: Likewise.
> >>>> So you'll have to bear with me, I'm not really familiar with this code,
> >>>> but in the absence of a maintainer I'll try to work through it.
> >>>>
> >>>>
> >>>>> -- BR, Hongtao
> >>>>>
> >>>>>
> >>>>> 0001-Fix-ix86_expand_sse_comi_round.patch
> >>>>>
> >>>>> Index: gcc/ChangeLog
> >>>>> ===================================================================
> >>>>> --- gcc/ChangeLog     (revision 270933)
> >>>>> +++ gcc/ChangeLog     (working copy)
> >>>>> @@ -1,3 +1,11 @@
> >>>>> +2019-05-06  H.J. Lu  <hongjiu...@intel.com>
> >>>>> +         Hongtao Liu  <hongtao....@intel.com>
> >>>>> +
> >>>>> +     PR Target/89750
> >>>>> +     PR Target/86444
> >>>>> +     * config/i386/i386-expand.c (ix86_expand_sse_comi_round):
> >>>>> +     Modified, original implementation isn't correct.
> >>>>> +
> >>>>>  2019-05-06  Segher Boessenkool  <seg...@kernel.crashing.org>
> >>>>>
> >>>>>       * config/rs6000/rs6000.md (FIRST_ALTIVEC_REGNO, 
> >>>>> LAST_ALTIVEC_REGNO)
> >>>>> Index: gcc/config/i386/i386-expand.c
> >>>>> ===================================================================
> >>>>> --- gcc/config/i386/i386-expand.c     (revision 270933)
> >>>>> +++ gcc/config/i386/i386-expand.c     (working copy)
> >>>>> @@ -9853,18 +9853,24 @@
> >>>>>    const struct insn_data_d *insn_p = &insn_data[icode];
> >>>>>    machine_mode mode0 = insn_p->operand[0].mode;
> >>>>>    machine_mode mode1 = insn_p->operand[1].mode;
> >>>>> -  enum rtx_code comparison = UNEQ;
> >>>>> -  bool need_ucomi = false;
> >>>>>
> >>>>>    /* See avxintrin.h for values.  */
> >>>>> -  enum rtx_code comi_comparisons[32] =
> >>>>> +  static const enum rtx_code comparisons[32] =
> >>>> So I assume the comment refers to the _CMP_* #defines in avxintrin.h?
> >>>>
> >>>   Yes.
> >>>>>      {
> >>>>> -      UNEQ, GT, GE, UNORDERED, LTGT, UNLE, UNLT, ORDERED, UNEQ, UNLT,
> >>>>> -      UNLE, LT, LTGT, GE, GT, LT, UNEQ, GT, GE, UNORDERED, LTGT, UNLE,
> >>>>> -      UNLT, ORDERED, UNEQ, UNLT, UNLE, LT, LTGT, GE, GT, LT
> >>>>> +      EQ, LT, LE, UNORDERED, NE, UNGE, UNGT, ORDERED,
> >>>>> +      EQ, UNLT, UNLE, UNORDERED, LTGT, GE, GT, ORDERED,
> >>>>> +      EQ, LT, LE, UNORDERED, NE, UNGE, UNGT, ORDERED,
> >>>>> +      EQ, UNLT, UNLE, UNORDERED, LTGT, GE, GT, ORDERED
> >>>>>      };
> >>>> For CMP_EQ_UQ aren't we looking for an unordered comparison, so UNEQ
> >>>> seems right, but you're using EQ.  Can you double-check this?  If it's
> >>>> wrong, then please make sure we cover this case with a test.
> >>>>
> >>> Avx512f-vcomis[sd]-2.c covers all 32 compare predicates.
> >>> UNEQ and EQ behave differently when either operand is NAN, besides
> >>> they're the same.
> >>> Since NAN operands are handled separtely, so EQ/UNEQ makes no
> >>> difference, That why this passes cover tests.
> >>> I'll correct it.
> >> Ah.  Thanks.  FWIW my approach was to walk through the _CMP_* defines in
> >> avxintrin.h and map that to what I thought the comparison ought to be.
> >> Then I reviewed my result against your patch.  I got a couple wrong, but
> >> could easily see my mistake.  The only one I couldn't reconcile was the
> >> CMP_EQ_UQ.  Knowing the NaNs are handled separately makes it clear.
> >> Thanks gain.
> >>
> >>
> >>
> >>>>
> >>>>
> >>>>> @@ -9932,11 +10021,37 @@
> >>>>>      }
> >>>>>
> >>>>>    emit_insn (pat);
> >>>>> +
> >>>>> +  /* XXX: Set CCFPmode and check a different CCmode.  Does it work
> >>>>> +     correctly?  */
> >>>>> +  if (GET_MODE (set_dst) != mode)
> >>>>> +    set_dst = gen_rtx_REG (mode, REGNO (set_dst));
> >>>> This looks worrisome, even without the cryptic comment.  I don't think
> >>>> you can just blindly change the mode like that.  Unless you happen to
> >>>> know that the only things you test in the new mode were set in precisely
> >>>> the same way as the old mode.
> >>>>
> >>> Modified as:
> >>> +  /* NB: Set CCFPmode and check a different CCmode.  */
> >>> +  if (GET_MODE (set_dst) != mode)
> >>> +    set_dst = gen_rtx_REG (mode, FLAGS_REG);
> >> That might actually be worse.  The mode carries semantic information
> >> about where to find the various condition codes within the flags
> >> register and which condition codes are valid.  The register number
> >> determines which (of possibly many) flags registers we are querying.
> >>
> >> Thus if the mode of SET_DEST is not the same as MODE, then there is a
> >> mismatch between the point where the condition codes were set and where
> >> we want to use them.
> >>
> >> That can only be safe is the condition codes set in the new mode are a
> >> strict subset of the condition codes set in the old mode and they're
> >> found in the same place within the same flags register.
> >>
> > vcomiss/vcomisd will set ZF,PF,CF,OF,SF,AF flags in  FLAGS_REG which is 
> > treated
> > as CCFPmode. As you mentioned, It's safe to use subset of the
> > condition codes in a
> > new mode,  in the same FLAGS_REG,  so I've modified as
> >
> > 1. Use CCZmode instead of CCmode.
> >
> > +    case EQ:
> > +      /* NB: COMI/UCOMI will set ZF with NAN operands.  Use CCZmode for
> > + _CMP_EQ_OQ/_CMP_EQ_OS.  */
> > +      check_unordered = true;
> > +      mode = CCZmode;
> > +      break;
> > +    case NE:
> > +      /* NB: COMI/UCOMI will set ZF with NAN operands.  Use CCZmode for
> > + _CMP_NEQ_UQ/_CMP_NEQ_US.  */
> > +      gcc_assert (!ordered);
> > +      check_unordered = true;
> > +      mode = CCZmode;
> > +      const_val = const1_rtx;
> >
> > 2. I've double checked that condition codes in new mode must be subset
> > of that in old mode.
> > Also gcc_assert is added to guard.
> >
> > +  /* NB: Set CCFPmode and check a different CCmode which is in subset
> > +     of CCFPmode.  */
> > +  if (GET_MODE (set_dst) != mode)
> > +    {
> > +      gcc_assert (mode == CCAmode || mode == CCCmode
> > +   || mode == CCOmode || mode == CCPmode
> > +   || mode == CCSmode || mode == CCZmode);
> > +      set_dst = gen_rtx_REG (mode, FLAGS_REG);
> > +    }
> > +
> >
> >
> >> Maybe a simple example would help.  Consider a fairly standard target
> >> with arithmetic insns that set ZNV and logicals that set ZN and clobber V.
> >>
> >> Arithmetic insns will have a set of the flags register with one mode
> >> (CC_ZNV for example) and logicals would use a different mode for the
> >> flags register (CC_ZN for example).
> >>
> >> Now consider if the architecture has memory bit operations.  Ie, you can
> >> query the state of a bit in memory.  Unfortunately the result is stored
> >> in the C bit of the condition code register (rather than in the Z bit
> >> like you might hope).  Yes, this comes from a real world example.
> >>
> >> During compare elimination, the compiler will try to eliminate a compare
> >> insn and instead use the condition codes set by a prior insn such as an
> >> arithmetic, logical, bit test, etc.  The mode of the flags register will
> >> indicate which bits are valid and which bits can be tested.  So for our
> >> hypothetical architecture we might have modes CC_ZNV, CC_ZN, CC_Z_IN_C
> >> for arithmetic, logical and bit testing.
> >>
> >> If we are using the result of a prior bit test to eliminate a compare,
> >> the mode of the flags register on the bit test would be CC_Z_N_C as
> >> would the mode of the conditional branch.
> >>
> >> We can not blindly change the mode to CC_ZNV because the bit we want is
> >> not in the ZNV  bits of the flag register, it's actually in the C bit.
> >>
> >> Things might be even more complicated if there's a separate register for
> >> floating point condition codes or multiple integer condition code 
> >> registers.
> >>
> >> Hopefully that makes things clear in terms of how the mode of the flags
> >> register is used as well as the register number.  You'd have to dig into
> >> the x86 port's use of the flags register to know what changes are safe
> >> and which aren't -- and that's the analysis I'd need to help move this
> >> along.
> >>
> >> Jeff
> >>
> > Thanks a lot for your detailed explanation, It helps me a lot.
> >
> > Here is updated patch.
> > -- BR, Hongtao
> >
> >
> > 0001-Fix-ix86_expand_sse_comi_round_v3.patch
> >
> > Index: gcc/ChangeLog
> > ===================================================================
> > --- gcc/ChangeLog     (revision 271761)
> > +++ gcc/ChangeLog     (working copy)
> > @@ -2372,6 +2372,14 @@
> >       * tree-ssa-phiopt.c (two_value_replacement): Fix a typo in parameter
> >       detection.
> >
> > +2019-05-06  H.J. Lu  <hongjiu...@intel.com>
> > +         Hongtao Liu  <hongtao....@intel.com>
> > +
> > +     PR target/89750
> > +     PR target/86444
> > +     * config/i386/i386-expand.c (ix86_expand_sse_comi_round):
> > +     Modified, original implementation isn't correct.
> OK for the trunk.
>
> Thanks for your patience,
> jeff

Author: liuhongt
Date: Mon Jun  3 02:20:33 2019
New Revision: 271853

URL: https://gcc.gnu.org/viewcvs?rev=271853&root=gcc&view=rev
Log:
2019-05-06  H.J. Lu  <hongjiu...@intel.com>
   Hongtao Liu  <hongtao....@intel.com>

PR target/89750
PR target/86444
* config/i386/i386-expand.c (ix86_expand_sse_comi_round):
Modified, original implementation isn't correct.

2019-05-06  H.J. Lu  <hongjiu...@intel.com>
   Hongtao Liu  <hongtao....@intel.com>

PR target/89750
PR target/86444
* gcc.target/i386/avx512f-vcomisd-2.c: New.
* gcc.target/i386/avx512f-vcomisd-2.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.target/i386/avx512f-vcomisd-2.c
    trunk/gcc/testsuite/gcc.target/i386/avx512f-vcomiss-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386-expand.c
    trunk/gcc/testsuite/ChangeLog


-- 
BR,
Hongtao

Reply via email to