On Sat, Jun 1, 2019 at 6:08 AM Jeff Law <l...@redhat.com> wrote: > > On 5/30/19 2:53 AM, Hongtao Liu wrote: > > On Thu, May 30, 2019 at 3:23 AM Jeff Law <l...@redhat.com> wrote: > >> On 5/9/19 10:54 PM, Hongtao Liu wrote: > >>> On Fri, May 10, 2019 at 3:55 AM Jeff Law <l...@redhat.com> wrote: > >>>> On 5/6/19 11:38 PM, Hongtao Liu wrote: > >>>>> Hi Uros and GCC: > >>>>> This patch is to fix ix86_expand_sse_comi_round whose implementation > >>>>> was not correct. > >>>>> New implentation aligns with _mm_cmp_round_s[sd]_mask. > >>>>> > >>>>> Bootstrap and regression tests for x86 is fine. > >>>>> Ok for trunk? > >>>>> > >>>>> > >>>>> ChangeLog: > >>>>> gcc/ > >>>>> * config/i386/i386-expand.c (ix86_expand_sse_comi_round): > >>>>> Modified, original implementation isn't correct. > >>>>> > >>>>> gcc/testsuite > >>>>> * gcc.target/i386/avx512f-vcomisd-2.c: New runtime tests. > >>>>> * gcc.target/i386/avx512f-vcomisd-2.c: Likewise. > >>>> So you'll have to bear with me, I'm not really familiar with this code, > >>>> but in the absence of a maintainer I'll try to work through it. > >>>> > >>>> > >>>>> -- BR, Hongtao > >>>>> > >>>>> > >>>>> 0001-Fix-ix86_expand_sse_comi_round.patch > >>>>> > >>>>> Index: gcc/ChangeLog > >>>>> =================================================================== > >>>>> --- gcc/ChangeLog (revision 270933) > >>>>> +++ gcc/ChangeLog (working copy) > >>>>> @@ -1,3 +1,11 @@ > >>>>> +2019-05-06 H.J. Lu <hongjiu...@intel.com> > >>>>> + Hongtao Liu <hongtao....@intel.com> > >>>>> + > >>>>> + PR Target/89750 > >>>>> + PR Target/86444 > >>>>> + * config/i386/i386-expand.c (ix86_expand_sse_comi_round): > >>>>> + Modified, original implementation isn't correct. > >>>>> + > >>>>> 2019-05-06 Segher Boessenkool <seg...@kernel.crashing.org> > >>>>> > >>>>> * config/rs6000/rs6000.md (FIRST_ALTIVEC_REGNO, > >>>>> LAST_ALTIVEC_REGNO) > >>>>> Index: gcc/config/i386/i386-expand.c > >>>>> =================================================================== > >>>>> --- gcc/config/i386/i386-expand.c (revision 270933) > >>>>> +++ gcc/config/i386/i386-expand.c (working copy) > >>>>> @@ -9853,18 +9853,24 @@ > >>>>> const struct insn_data_d *insn_p = &insn_data[icode]; > >>>>> machine_mode mode0 = insn_p->operand[0].mode; > >>>>> machine_mode mode1 = insn_p->operand[1].mode; > >>>>> - enum rtx_code comparison = UNEQ; > >>>>> - bool need_ucomi = false; > >>>>> > >>>>> /* See avxintrin.h for values. */ > >>>>> - enum rtx_code comi_comparisons[32] = > >>>>> + static const enum rtx_code comparisons[32] = > >>>> So I assume the comment refers to the _CMP_* #defines in avxintrin.h? > >>>> > >>> Yes. > >>>>> { > >>>>> - UNEQ, GT, GE, UNORDERED, LTGT, UNLE, UNLT, ORDERED, UNEQ, UNLT, > >>>>> - UNLE, LT, LTGT, GE, GT, LT, UNEQ, GT, GE, UNORDERED, LTGT, UNLE, > >>>>> - UNLT, ORDERED, UNEQ, UNLT, UNLE, LT, LTGT, GE, GT, LT > >>>>> + EQ, LT, LE, UNORDERED, NE, UNGE, UNGT, ORDERED, > >>>>> + EQ, UNLT, UNLE, UNORDERED, LTGT, GE, GT, ORDERED, > >>>>> + EQ, LT, LE, UNORDERED, NE, UNGE, UNGT, ORDERED, > >>>>> + EQ, UNLT, UNLE, UNORDERED, LTGT, GE, GT, ORDERED > >>>>> }; > >>>> For CMP_EQ_UQ aren't we looking for an unordered comparison, so UNEQ > >>>> seems right, but you're using EQ. Can you double-check this? If it's > >>>> wrong, then please make sure we cover this case with a test. > >>>> > >>> Avx512f-vcomis[sd]-2.c covers all 32 compare predicates. > >>> UNEQ and EQ behave differently when either operand is NAN, besides > >>> they're the same. > >>> Since NAN operands are handled separtely, so EQ/UNEQ makes no > >>> difference, That why this passes cover tests. > >>> I'll correct it. > >> Ah. Thanks. FWIW my approach was to walk through the _CMP_* defines in > >> avxintrin.h and map that to what I thought the comparison ought to be. > >> Then I reviewed my result against your patch. I got a couple wrong, but > >> could easily see my mistake. The only one I couldn't reconcile was the > >> CMP_EQ_UQ. Knowing the NaNs are handled separately makes it clear. > >> Thanks gain. > >> > >> > >> > >>>> > >>>> > >>>>> @@ -9932,11 +10021,37 @@ > >>>>> } > >>>>> > >>>>> emit_insn (pat); > >>>>> + > >>>>> + /* XXX: Set CCFPmode and check a different CCmode. Does it work > >>>>> + correctly? */ > >>>>> + if (GET_MODE (set_dst) != mode) > >>>>> + set_dst = gen_rtx_REG (mode, REGNO (set_dst)); > >>>> This looks worrisome, even without the cryptic comment. I don't think > >>>> you can just blindly change the mode like that. Unless you happen to > >>>> know that the only things you test in the new mode were set in precisely > >>>> the same way as the old mode. > >>>> > >>> Modified as: > >>> + /* NB: Set CCFPmode and check a different CCmode. */ > >>> + if (GET_MODE (set_dst) != mode) > >>> + set_dst = gen_rtx_REG (mode, FLAGS_REG); > >> That might actually be worse. The mode carries semantic information > >> about where to find the various condition codes within the flags > >> register and which condition codes are valid. The register number > >> determines which (of possibly many) flags registers we are querying. > >> > >> Thus if the mode of SET_DEST is not the same as MODE, then there is a > >> mismatch between the point where the condition codes were set and where > >> we want to use them. > >> > >> That can only be safe is the condition codes set in the new mode are a > >> strict subset of the condition codes set in the old mode and they're > >> found in the same place within the same flags register. > >> > > vcomiss/vcomisd will set ZF,PF,CF,OF,SF,AF flags in FLAGS_REG which is > > treated > > as CCFPmode. As you mentioned, It's safe to use subset of the > > condition codes in a > > new mode, in the same FLAGS_REG, so I've modified as > > > > 1. Use CCZmode instead of CCmode. > > > > + case EQ: > > + /* NB: COMI/UCOMI will set ZF with NAN operands. Use CCZmode for > > + _CMP_EQ_OQ/_CMP_EQ_OS. */ > > + check_unordered = true; > > + mode = CCZmode; > > + break; > > + case NE: > > + /* NB: COMI/UCOMI will set ZF with NAN operands. Use CCZmode for > > + _CMP_NEQ_UQ/_CMP_NEQ_US. */ > > + gcc_assert (!ordered); > > + check_unordered = true; > > + mode = CCZmode; > > + const_val = const1_rtx; > > > > 2. I've double checked that condition codes in new mode must be subset > > of that in old mode. > > Also gcc_assert is added to guard. > > > > + /* NB: Set CCFPmode and check a different CCmode which is in subset > > + of CCFPmode. */ > > + if (GET_MODE (set_dst) != mode) > > + { > > + gcc_assert (mode == CCAmode || mode == CCCmode > > + || mode == CCOmode || mode == CCPmode > > + || mode == CCSmode || mode == CCZmode); > > + set_dst = gen_rtx_REG (mode, FLAGS_REG); > > + } > > + > > > > > >> Maybe a simple example would help. Consider a fairly standard target > >> with arithmetic insns that set ZNV and logicals that set ZN and clobber V. > >> > >> Arithmetic insns will have a set of the flags register with one mode > >> (CC_ZNV for example) and logicals would use a different mode for the > >> flags register (CC_ZN for example). > >> > >> Now consider if the architecture has memory bit operations. Ie, you can > >> query the state of a bit in memory. Unfortunately the result is stored > >> in the C bit of the condition code register (rather than in the Z bit > >> like you might hope). Yes, this comes from a real world example. > >> > >> During compare elimination, the compiler will try to eliminate a compare > >> insn and instead use the condition codes set by a prior insn such as an > >> arithmetic, logical, bit test, etc. The mode of the flags register will > >> indicate which bits are valid and which bits can be tested. So for our > >> hypothetical architecture we might have modes CC_ZNV, CC_ZN, CC_Z_IN_C > >> for arithmetic, logical and bit testing. > >> > >> If we are using the result of a prior bit test to eliminate a compare, > >> the mode of the flags register on the bit test would be CC_Z_N_C as > >> would the mode of the conditional branch. > >> > >> We can not blindly change the mode to CC_ZNV because the bit we want is > >> not in the ZNV bits of the flag register, it's actually in the C bit. > >> > >> Things might be even more complicated if there's a separate register for > >> floating point condition codes or multiple integer condition code > >> registers. > >> > >> Hopefully that makes things clear in terms of how the mode of the flags > >> register is used as well as the register number. You'd have to dig into > >> the x86 port's use of the flags register to know what changes are safe > >> and which aren't -- and that's the analysis I'd need to help move this > >> along. > >> > >> Jeff > >> > > Thanks a lot for your detailed explanation, It helps me a lot. > > > > Here is updated patch. > > -- BR, Hongtao > > > > > > 0001-Fix-ix86_expand_sse_comi_round_v3.patch > > > > Index: gcc/ChangeLog > > =================================================================== > > --- gcc/ChangeLog (revision 271761) > > +++ gcc/ChangeLog (working copy) > > @@ -2372,6 +2372,14 @@ > > * tree-ssa-phiopt.c (two_value_replacement): Fix a typo in parameter > > detection. > > > > +2019-05-06 H.J. Lu <hongjiu...@intel.com> > > + Hongtao Liu <hongtao....@intel.com> > > + > > + PR target/89750 > > + PR target/86444 > > + * config/i386/i386-expand.c (ix86_expand_sse_comi_round): > > + Modified, original implementation isn't correct. > OK for the trunk. > > Thanks for your patience, > jeff
Author: liuhongt Date: Mon Jun 3 02:20:33 2019 New Revision: 271853 URL: https://gcc.gnu.org/viewcvs?rev=271853&root=gcc&view=rev Log: 2019-05-06 H.J. Lu <hongjiu...@intel.com> Hongtao Liu <hongtao....@intel.com> PR target/89750 PR target/86444 * config/i386/i386-expand.c (ix86_expand_sse_comi_round): Modified, original implementation isn't correct. 2019-05-06 H.J. Lu <hongjiu...@intel.com> Hongtao Liu <hongtao....@intel.com> PR target/89750 PR target/86444 * gcc.target/i386/avx512f-vcomisd-2.c: New. * gcc.target/i386/avx512f-vcomisd-2.c: Likewise. Added: trunk/gcc/testsuite/gcc.target/i386/avx512f-vcomisd-2.c trunk/gcc/testsuite/gcc.target/i386/avx512f-vcomiss-2.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386-expand.c trunk/gcc/testsuite/ChangeLog -- BR, Hongtao