On 06/09/17 14:17, Bernd Edlinger wrote: > On 09/06/17 14:51, Richard Earnshaw (lists) wrote: >> On 06/09/17 13:44, Bernd Edlinger wrote: >>> On 09/04/17 21:54, Bernd Edlinger wrote: >>>> Hi Kyrill, >>>> >>>> Thanks for your review! >>>> >>>> >>>> On 09/04/17 15:55, Kyrill Tkachov wrote: >>>>> Hi Bernd, >>>>> >>>>> On 18/01/17 15:36, Bernd Edlinger wrote: >>>>>> On 01/13/17 19:28, Bernd Edlinger wrote: >>>>>>> On 01/13/17 17:10, Bernd Edlinger wrote: >>>>>>>> On 01/13/17 14:50, Richard Earnshaw (lists) wrote: >>>>>>>>> On 18/12/16 12:58, Bernd Edlinger wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> this is related to PR77308, the follow-up patch will depend on this >>>>>>>>>> one. >>>>>>>>>> >>>>>>>>>> When trying the split the *arm_cmpdi_insn and *arm_cmpdi_unsigned >>>>>>>>>> before reload, a mis-compilation in libgcc function >>>>>>>>>> __gnu_satfractdasq >>>>>>>>>> was discovered, see [1] for more details. >>>>>>>>>> >>>>>>>>>> The reason seems to be that when the *arm_cmpdi_insn is directly >>>>>>>>>> followed by a *arm_cmpdi_unsigned instruction, both are split >>>>>>>>>> up into this: >>>>>>>>>> >>>>>>>>>> [(set (reg:CC CC_REGNUM) >>>>>>>>>> (compare:CC (match_dup 0) (match_dup 1))) >>>>>>>>>> (parallel [(set (reg:CC CC_REGNUM) >>>>>>>>>> (compare:CC (match_dup 3) (match_dup 4))) >>>>>>>>>> (set (match_dup 2) >>>>>>>>>> (minus:SI (match_dup 5) >>>>>>>>>> (ltu:SI (reg:CC_C CC_REGNUM) >>>>>>>>>> (const_int >>>>>>>>>> 0))))])] >>>>>>>>>> >>>>>>>>>> [(set (reg:CC CC_REGNUM) >>>>>>>>>> (compare:CC (match_dup 2) (match_dup 3))) >>>>>>>>>> (cond_exec (eq:SI (reg:CC CC_REGNUM) (const_int 0)) >>>>>>>>>> (set (reg:CC CC_REGNUM) >>>>>>>>>> (compare:CC (match_dup 0) (match_dup 1))))] >>>>>>>>>> >>>>>>>>>> The problem is that the reg:CC from the *subsi3_carryin_compare >>>>>>>>>> is not mentioning that the reg:CC is also dependent on the reg:CC >>>>>>>>>> from before. Therefore the *arm_cmpsi_insn appears to be >>>>>>>>>> redundant and thus got removed, because the data values are >>>>>>>>>> identical. >>>>>>>>>> >>>>>>>>>> I think that applies to a number of similar pattern where data >>>>>>>>>> flow is happening through the CC reg. >>>>>>>>>> >>>>>>>>>> So this is a kind of correctness issue, and should be fixed >>>>>>>>>> independently from the optimization issue PR77308. >>>>>>>>>> >>>>>>>>>> Therefore I think the patterns need to specify the true >>>>>>>>>> value that will be in the CC reg, in order for cse to >>>>>>>>>> know what the instructions are really doing. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Bootstrapped and reg-tested on arm-linux-gnueabihf. >>>>>>>>>> Is it OK for trunk? >>>>>>>>>> >>>>>>>>> I agree you've found a valid problem here, but I have some issues >>>>>>>>> with >>>>>>>>> the patch itself. >>>>>>>>> >>>>>>>>> >>>>>>>>> (define_insn_and_split "subdi3_compare1" >>>>>>>>> [(set (reg:CC_NCV CC_REGNUM) >>>>>>>>> (compare:CC_NCV >>>>>>>>> (match_operand:DI 1 "register_operand" "r") >>>>>>>>> (match_operand:DI 2 "register_operand" "r"))) >>>>>>>>> (set (match_operand:DI 0 "register_operand" "=&r") >>>>>>>>> (minus:DI (match_dup 1) (match_dup 2)))] >>>>>>>>> "TARGET_32BIT" >>>>>>>>> "#" >>>>>>>>> "&& reload_completed" >>>>>>>>> [(parallel [(set (reg:CC CC_REGNUM) >>>>>>>>> (compare:CC (match_dup 1) (match_dup 2))) >>>>>>>>> (set (match_dup 0) (minus:SI (match_dup 1) (match_dup >>>>>>>>> 2)))]) >>>>>>>>> (parallel [(set (reg:CC_C CC_REGNUM) >>>>>>>>> (compare:CC_C >>>>>>>>> (zero_extend:DI (match_dup 4)) >>>>>>>>> (plus:DI (zero_extend:DI (match_dup 5)) >>>>>>>>> (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) >>>>>>>>> (set (match_dup 3) >>>>>>>>> (minus:SI (minus:SI (match_dup 4) (match_dup 5)) >>>>>>>>> (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] >>>>>>>>> >>>>>>>>> >>>>>>>>> This pattern is now no-longer self consistent in that before the >>>>>>>>> split >>>>>>>>> the overall result for the condition register is in mode CC_NCV, but >>>>>>>>> afterwards it is just CC_C. >>>>>>>>> >>>>>>>>> I think CC_NCV is correct mode (the N, C and V bits all correctly >>>>>>>>> reflect the result of the 64-bit comparison), but that then >>>>>>>>> implies that >>>>>>>>> the cc mode of subsi3_carryin_compare is incorrect as well and >>>>>>>>> should in >>>>>>>>> fact also be CC_NCV. Thinking about this pattern, I'm inclined to >>>>>>>>> agree >>>>>>>>> that CC_NCV is the correct mode for this operation >>>>>>>>> >>>>>>>>> I'm not sure if there are other consequences that will fall out from >>>>>>>>> fixing this (it's possible that we might need a change to >>>>>>>>> select_cc_mode >>>>>>>>> as well). >>>>>>>>> >>>>>>>> Yes, this is still a bit awkward... >>>>>>>> >>>>>>>> The N and V bit will be the correct result for the subdi3_compare1 >>>>>>>> a 64-bit comparison, but zero_extend:DI (match_dup 4) (plus:DI ...) >>>>>>>> only gets the C bit correct, the expression for N and V is a different >>>>>>>> one. >>>>>>>> >>>>>>>> It probably works, because the subsi3_carryin_compare instruction sets >>>>>>>> more CC bits than the pattern does explicitly specify the value. >>>>>>>> We know the subsi3_carryin_compare also computes the NV bits, but >>>>>>>> it is >>>>>>>> hard to write down the correct rtl expression for it. >>>>>>>> >>>>>>>> In theory the pattern should describe everything correctly, >>>>>>>> maybe, like: >>>>>>>> >>>>>>>> set (reg:CC_C CC_REGNUM) >>>>>>>> (compare:CC_C >>>>>>>> (zero_extend:DI (match_dup 4)) >>>>>>>> (plus:DI (zero_extend:DI (match_dup 5)) >>>>>>>> (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) >>>>>>>> set (reg:CC_NV CC_REGNUM) >>>>>>>> (compare:CC_NV >>>>>>>> (match_dup 4)) >>>>>>>> (plus:SI (match_dup 5) (ltu:SI (reg:CC_C CC_REGNUM) >>>>>>>> (const_int 0))) >>>>>>>> set (match_dup 3) >>>>>>>> (minus:SI (minus:SI (match_dup 4) (match_dup 5)) >>>>>>>> (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))) >>>>>>>> >>>>>>>> >>>>>>>> But I doubt that will work to set CC_REGNUM with two different modes >>>>>>>> in parallel? >>>>>>>> >>>>>>>> Another idea would be to invent a CC_CNV_NOOV mode, that implicitly >>>>>>>> defines C from the DImode result, and NV from the SImode result, >>>>>>>> similar to the CC_NOOVmode, that also leaves something open what >>>>>>>> bits it really defines? >>>>>>>> >>>>>>>> >>>>>>>> What do you think? >>>>>>>> >>>>>>>> >>>>>>>> Thanks >>>>>>>> Bernd. >>>>>>> I think maybe the right solution is to invent a new CCmode >>>>>>> that defines C as if the comparison is done in DImode >>>>>>> but N and V as if the comparison is done in SImode. >>>>>>> >>>>>>> I thought maybe I would call it CC_NCV_CIC (CIC = Carry-In-Compare), >>>>>>> furthermore I think the CC_NOOV should be renamed to CC_NZ (because >>>>>>> only N and Z are set correctly), but in a different patch of course. >>>>>>> >>>>>>> Attached is a new version that implements the new CCmode. >>>>>>> >>>>>>> How do you like this new version? >>>>>>> >>>>>>> It seems to be able to build a cross-compiler at least. >>>>>>> >>>>>>> I will start a new bootstrap with this new patch, but that can take >>>>>>> some >>>>>>> time until I have definitive results. >>>>>>> >>>>>>> Is there still a chance that it can go into gcc-7 or should it wait >>>>>>> for the next stage1? >>>>>>> >>>>>>> Thanks >>>>>>> Bernd. >>>>>> >>>>>> I thought I should also look at where the subdi_compare1 amd the >>>>>> negdi2_compare patterns are used, and look if the caller is fine with >>>>>> not having all CC bits available. >>>>>> >>>>>> And indeed usubv<mode>4 turns out to be questionabe, because it >>>>>> emits gen_sub<mode>3_compare1 and uses arm_gen_unlikely_cbranch (LTU, >>>>>> CCmode) which is inconsistent when subdi3_compare1 no longer uses >>>>>> CCmode. >>>>>> >>>>>> To correct this, the branch should use CC_Cmode which is always defined. >>>>>> >>>>>> So I tried to test this pattern, with the following test programs, >>>>>> and found that the code actually improves when the branch uses CC_Cmode >>>>>> instead of CCmode, both for SImode and DImode data, which was a bit >>>>>> surprising. >>>>>> >>>>>> I used this test program to see how the usubv<mode>4 pattern works: >>>>>> >>>>>> cat test.c (DImode) >>>>>> unsigned long long x, y, z; >>>>>> int b; >>>>>> void test() >>>>>> { >>>>>> b = __builtin_sub_overflow (y,z, &x); >>>>>> } >>>>>> >>>>>> >>>>>> unpatched code used 8 byte more stack than patched, >>>>>> because the DImode subtraction is effectively done twice. >>>>>> >>>>>> cat test1.c (SImode) >>>>>> unsigned long x, y, z; >>>>>> int b; >>>>>> void test() >>>>>> { >>>>>> b = __builtin_sub_overflow (y,z, &x); >>>>>> } >>>>>> >>>>>> which generates (unpatched): >>>>>> cmp r3, r0 >>>>>> sub ip, r3, r0 >>>>>> >>>>>> instead of expected (patched): >>>>>> subs r3, r3, r2 >>>>>> >>>>>> >>>>>> The condition is extracted by ifconversion and/or combine >>>>>> and complicates the resulting code instead of simplifying. >>>>>> >>>>>> I think this happens only when the branch and the subsi/di3_compare1 >>>>>> is using the same CC mode. >>>>>> >>>>>> That does not happen when the CC modes disagree, as with the >>>>>> proposed patch. All other uses of the pattern are already using >>>>>> CC_Cmode or CC_Vmode in the branch, and these do not change. >>>>>> >>>>>> Attached is an updated version of the patch, that happens to >>>>>> improve the code generation of the usubsi4 and usubdi4 pattern, >>>>>> as a side effect. >>>>>> >>>>>> >>>>>> Bootstrapped and reg-tested on arm-linux-gnueabihf. >>>>>> Is it OK for trunk? >>>>> >>>>> I'm very sorry it has taken so long to review. >>>>> I've been ramping up on the context recently now so I'll try to move >>>>> this along... >>>>> >>>>> This patch looks mostly ok to me from reading the patterns and the >>>>> discussion around it. >>>>> I have one concern: >>>>> >>>>> >>>>> (define_insn_and_split "negdi2_compare" >>>>> - [(set (reg:CC CC_REGNUM) >>>>> - (compare:CC >>>>> + [(set (reg:CC_NCV CC_REGNUM) >>>>> + (compare:CC_NCV >>>>> (const_int 0) >>>>> (match_operand:DI 1 "register_operand" "0,r"))) >>>>> (set (match_operand:DI 0 "register_operand" "=r,&r") >>>>> @@ -4647,8 +4650,12 @@ >>>>> (compare:CC (const_int 0) (match_dup 1))) >>>>> (set (match_dup 0) (minus:SI (const_int 0) >>>>> (match_dup 1)))]) >>>>> - (parallel [(set (reg:CC CC_REGNUM) >>>>> - (compare:CC (const_int 0) (match_dup 3))) >>>>> + (parallel [(set (reg:CC_NCV_CIC CC_REGNUM) >>>>> + (compare:CC_NCV_CIC >>>>> + (const_int 0) >>>>> + (plus:DI >>>>> + (zero_extend:DI (match_dup 3)) >>>>> + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) >>>>> (set (match_dup 2) >>>>> (minus:SI >>>>> (minus:SI (const_int 0) (match_dup 3)) >>>>> >>>>> >>>>> I was somewhat concerned with having the first operand of the COMPARE >>>>> being a const_int 0 and the second being >>>>> a complex expression as the RTL canonicalization rules usually require >>>>> the complex operand going first if possible. >>>>> Reading the RTL rules in rtl.texi I see it says this: >>>>> "If one of the operands is a constant, it should be placed in the >>>>> second operand and the comparison code adjusted as appropriate." >>>>> So it seems that the pre-existing pattern that puts const_int 0 as the >>>>> first operand already breaks that rule. >>>>> I think we should fix that and update the use of condition code to a >>>>> GEU rather than LTU as well. >>>>> >>>> >>> >>> Well, the sentence before that one is even more explicit: >>> >>> "Normally, @var{x} and @var{y} must have the same mode. Otherwise, >>> @code{compare} is valid only if the mode of @var{x} is in class >>> @code{MODE_INT} and @var{y} is a @code{const_int} or >>> @code{const_double} with mode @code{VOIDmode}." >>> >>> So because the const_int 0 has VOIDmode the comparison is done >>> in y-mode not x-mode. >>> >>> But unfortunately I see no way how to accomplish this, >>> because this assumes that the compare can be easily swapped >>> if the conditional instruction just uses one of GT/GE/LE/LT >>> or GTU/GEU/LEU/LTU. But that is only the case for plain CCmode. >>> >>> And in this example we ask for "overflow", but while 0-X can >>> overflow X-0 simply can't. And moreover there are non-symmetric >>> modes like CC_NCVmode which only support LT/GE/LTU/GEU but not >>> the swapped conditions GT/LE/GTU/LEU. >>> >>> I think the only solution would be to adjust the spec to >>> reflect the implementation: >>> >>> Index: rtl.texi >>> =================================================================== >>> --- rtl.texi (revision 251752) >>> +++ rtl.texi (working copy) >>> @@ -2252,6 +2252,13 @@ >>> If one of the operands is a constant, it should be placed in the >>> second operand and the comparison code adjusted as appropriate. >>> >>> +There may be exceptions from this rule if the mode @var{m} carries >>> +not enough information for the swapped comparison operator, or >> >> There may be exceptions _to_ ... if mode @var{m} does not carry enough... >> >>> +if we ask for overflow from the subtraction. >> >> Aren't we really trying to 'detect overflow' rather than 'ask' for it? >> >>> That means, while >>> +0-X may overfow X-0 can never overflow. Under these conditions >>> +a compare may have the constant expression at the left side. >> >> In these circumstances the constant will be in the first operand . >> >> (left and right don't really make sense for RTL). >>> +Examples are the ARM negdi2_compare pattern and similar. >>> + >>> A @code{compare} specifying two @code{VOIDmode} constants is not valid >>> since there is no way to know in what mode the comparison is to be >>> performed; the comparison must either be folded during the compilation >>> >>> >>> >>> Please advise. >>> >>> Thanks >>> Bernd. >>> >>> >>>> >>>> Hmmm... >>>> >>>> I think the compare is not a commutative operation, and swapping >>>> the arguments will imply a different value in the flags. >>>> >>>> So if I write >>>> (set (reg:CC_NCV CC_REGNUM) >>>> (compare:CC_NCV >>>> (const_int 0) >>>> (reg:DI 123))) >>>> >>>> I have C,N,V set to the result of (0 - r123), C = usable for LTU or GEU, >>>> N,V = usable for LT, GE >>>> >>>> But if I write >>>> (set (reg:CC_NCV CC_REGNUM) >>>> (compare:CC_NCV >>>> (reg:DI 123) >>>> (const_int 0))) >>>> >>>> I have C,N,V set to the result of (r123 - 0), but the expansion stays >>>> the same and the actual value in the flags is defined by the expansion. >>>> Of course there exists probably no matching expansion for that. >>>> >>>> Note that both LTU in the above hunk are in a parallel-stmt and operate >>>> on the flags from the previous pattern, so changing these to GEU >>>> will probably be wrong. >>>> >>>> Both (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)) in the negdi2_compare >>>> use the flags from the previous (set (reg:CC CC_REGNUM) (compare:CC >>>> (const_int 0) (match_dup 1)). >>>> >>>> One use of the resulting flags (I know of) is in negvdi3 where we >>>> have: >>>> >>>> emit_insn (gen_negdi2_compare (operands[0], operands[1])); >>>> arm_gen_unlikely_cbranch (NE, CC_Vmode, operands[2]); >>>> >>>> I think only 0-x can overflow while x-0 can never overflow. >>>> >>>> Of course the CC_NCV_CIC mode bends the definition of the RTL compare >>>> a lot and I guess if this pattern is created by a splitter, this can >>>> only be expanded by an exactly matching pattern, there is (hopefully) >>>> no way how combine could mess with this pattern due to the exotic >>>> CCmode. So while I think it would work to swap only the notation of >>>> all CC_NCV_CIC patterns, _without_ changing the assembler-parts and the >>>> consuming statements, that would make it quite hard to follow for the >>>> human reader at least. >>>> >>>> What do you think? >>>> >>>> >>>> Bernd. >> > > Attached is the patch with an update to the rtl.texi documentation. > The code does not change, so I did no new bootstrap. > > > Is it OK for trunk? > > > Thanks > Bernd. > > > patch-pr77308-5.diff > > > 2017-09-06 Bernd Edlinger <bernd.edlin...@hotmail.de> > > PR target/77308 > * doc/rtl.texi: Update documentation. > * config/arm/arm-modes.def (CC_NCV_CIC): New mode. > * config/arm/arm.md (adddi3_compareV, *addsi3_compareV_upper, > adddi3_compareC, *addsi3_compareC_upper, subdi3_compare1, > subsi3_carryin_compare, subsi3_carryin_compare_const, > negdi2_compare, *negsi2_carryin_compare, > *arm_cmpdi_insn): Fix the CC reg dataflow. > (usubv<mode>4): Use CC_Cmode for the branch. > > Index: gcc/config/arm/arm-modes.def > =================================================================== > --- gcc/config/arm/arm-modes.def (revision 244439) > +++ gcc/config/arm/arm-modes.def (working copy) > @@ -38,6 +38,8 @@ > (used for DImode unsigned comparisons). > CC_NCVmode should be used if only the N, C, and V flags are correct > (used for DImode signed comparisons). > + CC_NCV_CICmode defines N and V in SImode and C in DImode > + (used for carryin_compare patterns). > CCmode should be used otherwise. */ > > CC_MODE (CC_NOOV); > @@ -44,6 +46,7 @@ > CC_MODE (CC_Z); > CC_MODE (CC_CZ); > CC_MODE (CC_NCV); > +CC_MODE (CC_NCV_CIC); > CC_MODE (CC_SWP); > CC_MODE (CCFP); > CC_MODE (CCFPE); > Index: gcc/config/arm/arm.md > =================================================================== > --- gcc/config/arm/arm.md (revision 244439) > +++ gcc/config/arm/arm.md (working copy) > @@ -669,17 +669,15 @@ > (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))]) > (parallel [(set (reg:CC_V CC_REGNUM) > (ne:CC_V > - (plus:DI (plus:DI > - (sign_extend:DI (match_dup 4)) > - (sign_extend:DI (match_dup 5))) > - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) > - (plus:DI (sign_extend:DI > - (plus:SI (match_dup 4) (match_dup 5))) > - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > - (set (match_dup 3) (plus:SI (plus:SI > - (match_dup 4) (match_dup 5)) > - (ltu:SI (reg:CC_C CC_REGNUM) > - (const_int 0))))])] > + (plus:DI (plus:DI (sign_extend:DI (match_dup 4)) > + (sign_extend:DI (match_dup 5))) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) > + (sign_extend:DI > + (plus:SI (plus:SI (match_dup 4) (match_dup 5)) > + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)))))) > + (set (match_dup 3) (plus:SI (plus:SI (match_dup 4) (match_dup 5)) > + (ltu:SI (reg:CC_C CC_REGNUM) > + (const_int 0))))])] > " > { > operands[3] = gen_highpart (SImode, operands[0]); > @@ -713,13 +711,13 @@ > [(set (reg:CC_V CC_REGNUM) > (ne:CC_V > (plus:DI > - (plus:DI > - (sign_extend:DI (match_operand:SI 1 "register_operand" "r")) > - (sign_extend:DI (match_operand:SI 2 "register_operand" "r"))) > - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) > - (plus:DI (sign_extend:DI > - (plus:SI (match_dup 1) (match_dup 2))) > - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > + (plus:DI > + (sign_extend:DI (match_operand:SI 1 "register_operand" "r")) > + (sign_extend:DI (match_operand:SI 2 "register_operand" "r"))) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) > + (sign_extend:DI (plus:SI (plus:SI (match_dup 1) (match_dup 2)) > + (ltu:SI (reg:CC_C CC_REGNUM) > + (const_int 0)))))) > (set (match_operand:SI 0 "register_operand" "=r") > (plus:SI > (plus:SI (match_dup 1) (match_dup 2)) > @@ -748,17 +746,15 @@ > (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))]) > (parallel [(set (reg:CC_C CC_REGNUM) > (ne:CC_C > - (plus:DI (plus:DI > - (zero_extend:DI (match_dup 4)) > - (zero_extend:DI (match_dup 5))) > - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) > - (plus:DI (zero_extend:DI > - (plus:SI (match_dup 4) (match_dup 5))) > - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > - (set (match_dup 3) (plus:SI > - (plus:SI (match_dup 4) (match_dup 5)) > - (ltu:SI (reg:CC_C CC_REGNUM) > - (const_int 0))))])] > + (plus:DI (plus:DI (zero_extend:DI (match_dup 4)) > + (zero_extend:DI (match_dup 5))) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) > + (zero_extend:DI > + (plus:SI (plus:SI (match_dup 4) (match_dup 5)) > + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)))))) > + (set (match_dup 3) (plus:SI (plus:SI (match_dup 4) (match_dup 5)) > + (ltu:SI (reg:CC_C CC_REGNUM) > + (const_int 0))))])] > " > { > operands[3] = gen_highpart (SImode, operands[0]); > @@ -777,17 +773,16 @@ > [(set (reg:CC_C CC_REGNUM) > (ne:CC_C > (plus:DI > - (plus:DI > - (zero_extend:DI (match_operand:SI 1 "register_operand" "r")) > - (zero_extend:DI (match_operand:SI 2 "register_operand" "r"))) > - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) > - (plus:DI (zero_extend:DI > - (plus:SI (match_dup 1) (match_dup 2))) > - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > + (plus:DI > + (zero_extend:DI (match_operand:SI 1 "register_operand" "r")) > + (zero_extend:DI (match_operand:SI 2 "register_operand" "r"))) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) > + (zero_extend:DI > + (plus:SI (plus:SI (match_dup 1) (match_dup 2)) > + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)))))) > (set (match_operand:SI 0 "register_operand" "=r") > - (plus:SI > - (plus:SI (match_dup 1) (match_dup 2)) > - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] > + (plus:SI (plus:SI (match_dup 1) (match_dup 2)) > + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] > "TARGET_32BIT" > "adcs%?\\t%0, %1, %2" > [(set_attr "conds" "set") > @@ -1080,14 +1075,14 @@ > "TARGET_32BIT" > { > emit_insn (gen_sub<mode>3_compare1 (operands[0], operands[1], > operands[2])); > - arm_gen_unlikely_cbranch (LTU, CCmode, operands[3]); > + arm_gen_unlikely_cbranch (EQ, CC_Cmode, operands[3]); > > DONE; > }) > > (define_insn_and_split "subdi3_compare1" > - [(set (reg:CC CC_REGNUM) > - (compare:CC > + [(set (reg:CC_NCV CC_REGNUM) > + (compare:CC_NCV > (match_operand:DI 1 "register_operand" "r") > (match_operand:DI 2 "register_operand" "r"))) > (set (match_operand:DI 0 "register_operand" "=&r") > @@ -1098,10 +1093,14 @@ > [(parallel [(set (reg:CC CC_REGNUM) > (compare:CC (match_dup 1) (match_dup 2))) > (set (match_dup 0) (minus:SI (match_dup 1) (match_dup 2)))]) > - (parallel [(set (reg:CC CC_REGNUM) > - (compare:CC (match_dup 4) (match_dup 5))) > - (set (match_dup 3) (minus:SI (minus:SI (match_dup 4) (match_dup 5)) > - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] > + (parallel [(set (reg:CC_NCV_CIC CC_REGNUM) > + (compare:CC_NCV_CIC > + (zero_extend:DI (match_dup 4)) > + (plus:DI (zero_extend:DI (match_dup 5)) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > + (set (match_dup 3) > + (minus:SI (minus:SI (match_dup 4) (match_dup 5)) > + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] > { > operands[3] = gen_highpart (SImode, operands[0]); > operands[0] = gen_lowpart (SImode, operands[0]); > @@ -1157,13 +1156,15 @@ > ) > > (define_insn "*subsi3_carryin_compare" > - [(set (reg:CC CC_REGNUM) > - (compare:CC (match_operand:SI 1 "s_register_operand" "r") > - (match_operand:SI 2 "s_register_operand" "r"))) > + [(set (reg:CC_NCV_CIC CC_REGNUM) > + (compare:CC_NCV_CIC > + (zero_extend:DI (match_operand:SI 1 "s_register_operand" "r")) > + (plus:DI > + (zero_extend:DI (match_operand:SI 2 "s_register_operand" "r")) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > (set (match_operand:SI 0 "s_register_operand" "=r") > - (minus:SI (minus:SI (match_dup 1) > - (match_dup 2)) > - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] > + (minus:SI (minus:SI (match_dup 1) (match_dup 2)) > + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] > "TARGET_32BIT" > "sbcs\\t%0, %1, %2" > [(set_attr "conds" "set") > @@ -1171,13 +1172,15 @@ > ) > > (define_insn "*subsi3_carryin_compare_const" > - [(set (reg:CC CC_REGNUM) > - (compare:CC (match_operand:SI 1 "reg_or_int_operand" "r") > - (match_operand:SI 2 "arm_not_operand" "K"))) > + [(set (reg:CC_NCV_CIC CC_REGNUM) > + (compare:CC_NCV_CIC > + (zero_extend:DI (match_operand:SI 1 "reg_or_int_operand" "r")) > + (plus:DI > + (zero_extend:DI (match_operand:SI 2 "arm_not_operand" "K")) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > (set (match_operand:SI 0 "s_register_operand" "=r") > - (minus:SI (plus:SI (match_dup 1) > - (match_dup 2)) > - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] > + (minus:SI (plus:SI (match_dup 1) (match_dup 2)) > + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] > "TARGET_32BIT" > "sbcs\\t%0, %1, #%B2" > [(set_attr "conds" "set") > @@ -4634,8 +4637,8 @@ > > > (define_insn_and_split "negdi2_compare" > - [(set (reg:CC CC_REGNUM) > - (compare:CC > + [(set (reg:CC_NCV CC_REGNUM) > + (compare:CC_NCV > (const_int 0) > (match_operand:DI 1 "register_operand" "0,r"))) > (set (match_operand:DI 0 "register_operand" "=r,&r") > @@ -4647,8 +4650,12 @@ > (compare:CC (const_int 0) (match_dup 1))) > (set (match_dup 0) (minus:SI (const_int 0) > (match_dup 1)))]) > - (parallel [(set (reg:CC CC_REGNUM) > - (compare:CC (const_int 0) (match_dup 3))) > + (parallel [(set (reg:CC_NCV_CIC CC_REGNUM) > + (compare:CC_NCV_CIC > + (const_int 0) > + (plus:DI > + (zero_extend:DI (match_dup 3)) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > (set (match_dup 2) > (minus:SI > (minus:SI (const_int 0) (match_dup 3)) > @@ -4707,12 +4714,14 @@ > ) > > (define_insn "*negsi2_carryin_compare" > - [(set (reg:CC CC_REGNUM) > - (compare:CC (const_int 0) > - (match_operand:SI 1 "s_register_operand" "r"))) > + [(set (reg:CC_NCV_CIC CC_REGNUM) > + (compare:CC_NCV_CIC > + (const_int 0) > + (plus:DI > + (zero_extend:DI (match_operand:SI 1 "s_register_operand" "r")) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > (set (match_operand:SI 0 "s_register_operand" "=r") > - (minus:SI (minus:SI (const_int 0) > - (match_dup 1)) > + (minus:SI (minus:SI (const_int 0) (match_dup 1)) > (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] > "TARGET_ARM" > "rscs\\t%0, %1, #0" > @@ -7361,12 +7370,15 @@ > "#" ; "cmp\\t%Q0, %Q1\;sbcs\\t%2, %R0, %R1" > "&& reload_completed" > [(set (reg:CC CC_REGNUM) > - (compare:CC (match_dup 0) (match_dup 1))) > - (parallel [(set (reg:CC CC_REGNUM) > - (compare:CC (match_dup 3) (match_dup 4))) > - (set (match_dup 2) > - (minus:SI (match_dup 5) > - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] > + (compare:CC (match_dup 0) (match_dup 1))) > + (parallel [(set (reg:CC_NCV_CIC CC_REGNUM) > + (compare:CC_NCV_CIC > + (zero_extend:DI (match_dup 3)) > + (plus:DI (zero_extend:DI (match_dup 4)) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > + (set (match_dup 2) > + (minus:SI (match_dup 5) > + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] > { > operands[3] = gen_highpart (SImode, operands[0]); > operands[0] = gen_lowpart (SImode, operands[0]); > Index: gcc/config/arm/arm-modes.def > =================================================================== > --- gcc/config/arm/arm-modes.def (revision 251752) > +++ gcc/config/arm/arm-modes.def (working copy) > @@ -38,6 +38,8 @@ > (used for DImode unsigned comparisons). > CC_NCVmode should be used if only the N, C, and V flags are correct > (used for DImode signed comparisons). > + CC_NCV_CICmode defines N and V in SImode and C in DImode > + (used for carryin_compare patterns). > CCmode should be used otherwise. */ > > CC_MODE (CC_NOOV); > @@ -44,6 +46,7 @@ > CC_MODE (CC_Z); > CC_MODE (CC_CZ); > CC_MODE (CC_NCV); > +CC_MODE (CC_NCV_CIC); > CC_MODE (CC_SWP); > CC_MODE (CCFP); > CC_MODE (CCFPE); > Index: gcc/config/arm/arm.md > =================================================================== > --- gcc/config/arm/arm.md (revision 251752) > +++ gcc/config/arm/arm.md (working copy) > @@ -664,17 +664,15 @@ > (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))]) > (parallel [(set (reg:CC_V CC_REGNUM) > (ne:CC_V > - (plus:DI (plus:DI > - (sign_extend:DI (match_dup 4)) > - (sign_extend:DI (match_dup 5))) > - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) > - (plus:DI (sign_extend:DI > - (plus:SI (match_dup 4) (match_dup 5))) > - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > - (set (match_dup 3) (plus:SI (plus:SI > - (match_dup 4) (match_dup 5)) > - (ltu:SI (reg:CC_C CC_REGNUM) > - (const_int 0))))])] > + (plus:DI (plus:DI (sign_extend:DI (match_dup 4)) > + (sign_extend:DI (match_dup 5))) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) > + (sign_extend:DI > + (plus:SI (plus:SI (match_dup 4) (match_dup 5)) > + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)))))) > + (set (match_dup 3) (plus:SI (plus:SI (match_dup 4) (match_dup 5)) > + (ltu:SI (reg:CC_C CC_REGNUM) > + (const_int 0))))])] > " > { > operands[3] = gen_highpart (SImode, operands[0]); > @@ -708,13 +706,13 @@ > [(set (reg:CC_V CC_REGNUM) > (ne:CC_V > (plus:DI > - (plus:DI > - (sign_extend:DI (match_operand:SI 1 "register_operand" "r")) > - (sign_extend:DI (match_operand:SI 2 "register_operand" "r"))) > - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) > - (plus:DI (sign_extend:DI > - (plus:SI (match_dup 1) (match_dup 2))) > - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > + (plus:DI > + (sign_extend:DI (match_operand:SI 1 "register_operand" "r")) > + (sign_extend:DI (match_operand:SI 2 "register_operand" "r"))) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) > + (sign_extend:DI (plus:SI (plus:SI (match_dup 1) (match_dup 2)) > + (ltu:SI (reg:CC_C CC_REGNUM) > + (const_int 0)))))) > (set (match_operand:SI 0 "register_operand" "=r") > (plus:SI > (plus:SI (match_dup 1) (match_dup 2)) > @@ -743,17 +741,15 @@ > (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))]) > (parallel [(set (reg:CC_C CC_REGNUM) > (ne:CC_C > - (plus:DI (plus:DI > - (zero_extend:DI (match_dup 4)) > - (zero_extend:DI (match_dup 5))) > - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) > - (plus:DI (zero_extend:DI > - (plus:SI (match_dup 4) (match_dup 5))) > - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > - (set (match_dup 3) (plus:SI > - (plus:SI (match_dup 4) (match_dup 5)) > - (ltu:SI (reg:CC_C CC_REGNUM) > - (const_int 0))))])] > + (plus:DI (plus:DI (zero_extend:DI (match_dup 4)) > + (zero_extend:DI (match_dup 5))) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) > + (zero_extend:DI > + (plus:SI (plus:SI (match_dup 4) (match_dup 5)) > + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)))))) > + (set (match_dup 3) (plus:SI (plus:SI (match_dup 4) (match_dup 5)) > + (ltu:SI (reg:CC_C CC_REGNUM) > + (const_int 0))))])] > " > { > operands[3] = gen_highpart (SImode, operands[0]); > @@ -772,17 +768,16 @@ > [(set (reg:CC_C CC_REGNUM) > (ne:CC_C > (plus:DI > - (plus:DI > - (zero_extend:DI (match_operand:SI 1 "register_operand" "r")) > - (zero_extend:DI (match_operand:SI 2 "register_operand" "r"))) > - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) > - (plus:DI (zero_extend:DI > - (plus:SI (match_dup 1) (match_dup 2))) > - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > + (plus:DI > + (zero_extend:DI (match_operand:SI 1 "register_operand" "r")) > + (zero_extend:DI (match_operand:SI 2 "register_operand" "r"))) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) > + (zero_extend:DI > + (plus:SI (plus:SI (match_dup 1) (match_dup 2)) > + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)))))) > (set (match_operand:SI 0 "register_operand" "=r") > - (plus:SI > - (plus:SI (match_dup 1) (match_dup 2)) > - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] > + (plus:SI (plus:SI (match_dup 1) (match_dup 2)) > + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] > "TARGET_32BIT" > "adcs%?\\t%0, %1, %2" > [(set_attr "conds" "set") > @@ -1075,14 +1070,14 @@ > "TARGET_32BIT" > { > emit_insn (gen_sub<mode>3_compare1 (operands[0], operands[1], > operands[2])); > - arm_gen_unlikely_cbranch (LTU, CCmode, operands[3]); > + arm_gen_unlikely_cbranch (EQ, CC_Cmode, operands[3]); > > DONE; > }) > > (define_insn_and_split "subdi3_compare1" > - [(set (reg:CC CC_REGNUM) > - (compare:CC > + [(set (reg:CC_NCV CC_REGNUM) > + (compare:CC_NCV > (match_operand:DI 1 "register_operand" "r") > (match_operand:DI 2 "register_operand" "r"))) > (set (match_operand:DI 0 "register_operand" "=&r") > @@ -1093,10 +1088,14 @@ > [(parallel [(set (reg:CC CC_REGNUM) > (compare:CC (match_dup 1) (match_dup 2))) > (set (match_dup 0) (minus:SI (match_dup 1) (match_dup 2)))]) > - (parallel [(set (reg:CC CC_REGNUM) > - (compare:CC (match_dup 4) (match_dup 5))) > - (set (match_dup 3) (minus:SI (minus:SI (match_dup 4) (match_dup 5)) > - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] > + (parallel [(set (reg:CC_NCV_CIC CC_REGNUM) > + (compare:CC_NCV_CIC > + (zero_extend:DI (match_dup 4)) > + (plus:DI (zero_extend:DI (match_dup 5)) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > + (set (match_dup 3) > + (minus:SI (minus:SI (match_dup 4) (match_dup 5)) > + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] > { > operands[3] = gen_highpart (SImode, operands[0]); > operands[0] = gen_lowpart (SImode, operands[0]); > @@ -1152,13 +1151,15 @@ > ) > > (define_insn "*subsi3_carryin_compare" > - [(set (reg:CC CC_REGNUM) > - (compare:CC (match_operand:SI 1 "s_register_operand" "r") > - (match_operand:SI 2 "s_register_operand" "r"))) > + [(set (reg:CC_NCV_CIC CC_REGNUM) > + (compare:CC_NCV_CIC > + (zero_extend:DI (match_operand:SI 1 "s_register_operand" "r")) > + (plus:DI > + (zero_extend:DI (match_operand:SI 2 "s_register_operand" "r")) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > (set (match_operand:SI 0 "s_register_operand" "=r") > - (minus:SI (minus:SI (match_dup 1) > - (match_dup 2)) > - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] > + (minus:SI (minus:SI (match_dup 1) (match_dup 2)) > + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] > "TARGET_32BIT" > "sbcs\\t%0, %1, %2" > [(set_attr "conds" "set") > @@ -1166,13 +1167,15 @@ > ) > > (define_insn "*subsi3_carryin_compare_const" > - [(set (reg:CC CC_REGNUM) > - (compare:CC (match_operand:SI 1 "reg_or_int_operand" "r") > - (match_operand:SI 2 "arm_not_operand" "K"))) > + [(set (reg:CC_NCV_CIC CC_REGNUM) > + (compare:CC_NCV_CIC > + (zero_extend:DI (match_operand:SI 1 "reg_or_int_operand" "r")) > + (plus:DI > + (zero_extend:DI (match_operand:SI 2 "arm_not_operand" "K")) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > (set (match_operand:SI 0 "s_register_operand" "=r") > - (minus:SI (plus:SI (match_dup 1) > - (match_dup 2)) > - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] > + (minus:SI (plus:SI (match_dup 1) (match_dup 2)) > + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] > "TARGET_32BIT" > "sbcs\\t%0, %1, #%B2" > [(set_attr "conds" "set") > @@ -4686,8 +4689,8 @@ > > > (define_insn_and_split "negdi2_compare" > - [(set (reg:CC CC_REGNUM) > - (compare:CC > + [(set (reg:CC_NCV CC_REGNUM) > + (compare:CC_NCV > (const_int 0) > (match_operand:DI 1 "register_operand" "0,r"))) > (set (match_operand:DI 0 "register_operand" "=r,&r") > @@ -4699,8 +4702,12 @@ > (compare:CC (const_int 0) (match_dup 1))) > (set (match_dup 0) (minus:SI (const_int 0) > (match_dup 1)))]) > - (parallel [(set (reg:CC CC_REGNUM) > - (compare:CC (const_int 0) (match_dup 3))) > + (parallel [(set (reg:CC_NCV_CIC CC_REGNUM) > + (compare:CC_NCV_CIC > + (const_int 0) > + (plus:DI > + (zero_extend:DI (match_dup 3)) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > (set (match_dup 2) > (minus:SI > (minus:SI (const_int 0) (match_dup 3)) > @@ -4759,12 +4766,14 @@ > ) > > (define_insn "*negsi2_carryin_compare" > - [(set (reg:CC CC_REGNUM) > - (compare:CC (const_int 0) > - (match_operand:SI 1 "s_register_operand" "r"))) > + [(set (reg:CC_NCV_CIC CC_REGNUM) > + (compare:CC_NCV_CIC > + (const_int 0) > + (plus:DI > + (zero_extend:DI (match_operand:SI 1 "s_register_operand" "r")) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > (set (match_operand:SI 0 "s_register_operand" "=r") > - (minus:SI (minus:SI (const_int 0) > - (match_dup 1)) > + (minus:SI (minus:SI (const_int 0) (match_dup 1)) > (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] > "TARGET_ARM" > "rscs\\t%0, %1, #0" > @@ -7438,12 +7447,15 @@ > "#" ; "cmp\\t%Q0, %Q1\;sbcs\\t%2, %R0, %R1" > "&& reload_completed" > [(set (reg:CC CC_REGNUM) > - (compare:CC (match_dup 0) (match_dup 1))) > - (parallel [(set (reg:CC CC_REGNUM) > - (compare:CC (match_dup 3) (match_dup 4))) > - (set (match_dup 2) > - (minus:SI (match_dup 5) > - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] > + (compare:CC (match_dup 0) (match_dup 1))) > + (parallel [(set (reg:CC_NCV_CIC CC_REGNUM) > + (compare:CC_NCV_CIC > + (zero_extend:DI (match_dup 3)) > + (plus:DI (zero_extend:DI (match_dup 4)) > + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > + (set (match_dup 2) > + (minus:SI (match_dup 5) > + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] > { > operands[3] = gen_highpart (SImode, operands[0]); > operands[0] = gen_lowpart (SImode, operands[0]); > Index: gcc/doc/rtl.texi > =================================================================== > --- gcc/doc/rtl.texi (revision 251752) > +++ gcc/doc/rtl.texi (working copy) > @@ -2252,6 +2252,13 @@ > If one of the operands is a constant, it should be placed in the > second operand and the comparison code adjusted as appropriate. > > +There may be exceptions to this rule if the mode @var{m} does not > +carry enough information for the swapped comparison operator, or > +if we try to detect overflow from the subtraction. That means, while > +0-X may overfow X-0 can never overflow. Under these conditions > +a compare may have the constant expression at the first operand. > +Examples are the ARM negdi2_compare pattern and similar. > + > A @code{compare} specifying two @code{VOIDmode} constants is not valid > since there is no way to know in what mode the comparison is to be > performed; the comparison must either be folded during the compilation >
Er, hold on. Comparisons don't 'overflow'. They compare two values and tell you something about their relationship. If A cmp B doesn't tell me the same basic things as B swapped(cmp) A, then the world will fall apart big time. So your documentation patch can't be right. R.