On 01/13/17 17:10, Bernd Edlinger wrote: > On 01/13/17 14:50, Richard Earnshaw (lists) wrote: >> On 18/12/16 12:58, Bernd Edlinger wrote: >>> Hi, >>> >>> this is related to PR77308, the follow-up patch will depend on this one. >>> >>> When trying the split the *arm_cmpdi_insn and *arm_cmpdi_unsigned >>> before reload, a mis-compilation in libgcc function __gnu_satfractdasq >>> was discovered, see [1] for more details. >>> >>> The reason seems to be that when the *arm_cmpdi_insn is directly >>> followed by a *arm_cmpdi_unsigned instruction, both are split >>> up into this: >>> >>> [(set (reg:CC CC_REGNUM) >>> (compare:CC (match_dup 0) (match_dup 1))) >>> (parallel [(set (reg:CC CC_REGNUM) >>> (compare:CC (match_dup 3) (match_dup 4))) >>> (set (match_dup 2) >>> (minus:SI (match_dup 5) >>> (ltu:SI (reg:CC_C CC_REGNUM) (const_int >>> 0))))])] >>> >>> [(set (reg:CC CC_REGNUM) >>> (compare:CC (match_dup 2) (match_dup 3))) >>> (cond_exec (eq:SI (reg:CC CC_REGNUM) (const_int 0)) >>> (set (reg:CC CC_REGNUM) >>> (compare:CC (match_dup 0) (match_dup 1))))] >>> >>> The problem is that the reg:CC from the *subsi3_carryin_compare >>> is not mentioning that the reg:CC is also dependent on the reg:CC >>> from before. Therefore the *arm_cmpsi_insn appears to be >>> redundant and thus got removed, because the data values are identical. >>> >>> I think that applies to a number of similar pattern where data >>> flow is happening through the CC reg. >>> >>> So this is a kind of correctness issue, and should be fixed >>> independently from the optimization issue PR77308. >>> >>> Therefore I think the patterns need to specify the true >>> value that will be in the CC reg, in order for cse to >>> know what the instructions are really doing. >>> >>> >>> Bootstrapped and reg-tested on arm-linux-gnueabihf. >>> Is it OK for trunk? >>> >> >> I agree you've found a valid problem here, but I have some issues with >> the patch itself. >> >> >> (define_insn_and_split "subdi3_compare1" >> [(set (reg:CC_NCV CC_REGNUM) >> (compare:CC_NCV >> (match_operand:DI 1 "register_operand" "r") >> (match_operand:DI 2 "register_operand" "r"))) >> (set (match_operand:DI 0 "register_operand" "=&r") >> (minus:DI (match_dup 1) (match_dup 2)))] >> "TARGET_32BIT" >> "#" >> "&& reload_completed" >> [(parallel [(set (reg:CC CC_REGNUM) >> (compare:CC (match_dup 1) (match_dup 2))) >> (set (match_dup 0) (minus:SI (match_dup 1) (match_dup 2)))]) >> (parallel [(set (reg:CC_C CC_REGNUM) >> (compare:CC_C >> (zero_extend:DI (match_dup 4)) >> (plus:DI (zero_extend:DI (match_dup 5)) >> (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) >> (set (match_dup 3) >> (minus:SI (minus:SI (match_dup 4) (match_dup 5)) >> (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] >> >> >> This pattern is now no-longer self consistent in that before the split >> the overall result for the condition register is in mode CC_NCV, but >> afterwards it is just CC_C. >> >> I think CC_NCV is correct mode (the N, C and V bits all correctly >> reflect the result of the 64-bit comparison), but that then implies that >> the cc mode of subsi3_carryin_compare is incorrect as well and should in >> fact also be CC_NCV. Thinking about this pattern, I'm inclined to agree >> that CC_NCV is the correct mode for this operation >> >> I'm not sure if there are other consequences that will fall out from >> fixing this (it's possible that we might need a change to select_cc_mode >> as well). >> > > Yes, this is still a bit awkward... > > The N and V bit will be the correct result for the subdi3_compare1 > a 64-bit comparison, but zero_extend:DI (match_dup 4) (plus:DI ...) > only gets the C bit correct, the expression for N and V is a different > one. > > It probably works, because the subsi3_carryin_compare instruction sets > more CC bits than the pattern does explicitly specify the value. > We know the subsi3_carryin_compare also computes the NV bits, but it is > hard to write down the correct rtl expression for it. > > In theory the pattern should describe everything correctly, > maybe, like: > > set (reg:CC_C CC_REGNUM) > (compare:CC_C > (zero_extend:DI (match_dup 4)) > (plus:DI (zero_extend:DI (match_dup 5)) > (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) > set (reg:CC_NV CC_REGNUM) > (compare:CC_NV > (match_dup 4)) > (plus:SI (match_dup 5) (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))) > set (match_dup 3) > (minus:SI (minus:SI (match_dup 4) (match_dup 5)) > (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))) > > > But I doubt that will work to set CC_REGNUM with two different modes > in parallel? > > Another idea would be to invent a CC_CNV_NOOV mode, that implicitly > defines C from the DImode result, and NV from the SImode result, > similar to the CC_NOOVmode, that also leaves something open what > bits it really defines? > > > What do you think? > > > Thanks > Bernd.
I think maybe the right solution is to invent a new CCmode that defines C as if the comparison is done in DImode but N and V as if the comparison is done in SImode. I thought maybe I would call it CC_NCV_CIC (CIC = Carry-In-Compare), furthermore I think the CC_NOOV should be renamed to CC_NZ (because only N and Z are set correctly), but in a different patch of course. Attached is a new version that implements the new CCmode. How do you like this new version? It seems to be able to build a cross-compiler at least. I will start a new bootstrap with this new patch, but that can take some time until I have definitive results. Is there still a chance that it can go into gcc-7 or should it wait for the next stage1? Thanks Bernd.
2016-01-13 Bernd Edlinger <bernd.edlin...@hotmail.de> PR target/77308 * config/arm/arm-modes.def (CC_NCV_CIC): New mode. * config/arm/arm.md (adddi3_compareV, *addsi3_compareV_upper, adddi3_compareC, *addsi3_compareC_upper, subdi3_compare1, subsi3_carryin_compare, subsi3_carryin_compare_const, negdi2_compare, *negsi2_carryin_compare, *arm_cmpdi_insn): Fix the CC reg dataflow. Index: gcc/config/arm/arm-modes.def =================================================================== --- gcc/config/arm/arm-modes.def (revision 244439) +++ gcc/config/arm/arm-modes.def (working copy) @@ -38,6 +38,8 @@ (used for DImode unsigned comparisons). CC_NCVmode should be used if only the N, C, and V flags are correct (used for DImode signed comparisons). + CC_NCV_CICmode defines N and V in SImode and C in DImode + (used for carryin_compare patterns). CCmode should be used otherwise. */ CC_MODE (CC_NOOV); @@ -44,6 +46,7 @@ CC_MODE (CC_Z); CC_MODE (CC_CZ); CC_MODE (CC_NCV); +CC_MODE (CC_NCV_CIC); CC_MODE (CC_SWP); CC_MODE (CCFP); CC_MODE (CCFPE); Index: gcc/config/arm/arm.md =================================================================== --- gcc/config/arm/arm.md (revision 244439) +++ gcc/config/arm/arm.md (working copy) @@ -669,17 +669,15 @@ (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))]) (parallel [(set (reg:CC_V CC_REGNUM) (ne:CC_V - (plus:DI (plus:DI - (sign_extend:DI (match_dup 4)) - (sign_extend:DI (match_dup 5))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) - (plus:DI (sign_extend:DI - (plus:SI (match_dup 4) (match_dup 5))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) - (set (match_dup 3) (plus:SI (plus:SI - (match_dup 4) (match_dup 5)) - (ltu:SI (reg:CC_C CC_REGNUM) - (const_int 0))))])] + (plus:DI (plus:DI (sign_extend:DI (match_dup 4)) + (sign_extend:DI (match_dup 5))) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) + (sign_extend:DI + (plus:SI (plus:SI (match_dup 4) (match_dup 5)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)))))) + (set (match_dup 3) (plus:SI (plus:SI (match_dup 4) (match_dup 5)) + (ltu:SI (reg:CC_C CC_REGNUM) + (const_int 0))))])] " { operands[3] = gen_highpart (SImode, operands[0]); @@ -713,13 +711,13 @@ [(set (reg:CC_V CC_REGNUM) (ne:CC_V (plus:DI - (plus:DI - (sign_extend:DI (match_operand:SI 1 "register_operand" "r")) - (sign_extend:DI (match_operand:SI 2 "register_operand" "r"))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) - (plus:DI (sign_extend:DI - (plus:SI (match_dup 1) (match_dup 2))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) + (plus:DI + (sign_extend:DI (match_operand:SI 1 "register_operand" "r")) + (sign_extend:DI (match_operand:SI 2 "register_operand" "r"))) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) + (sign_extend:DI (plus:SI (plus:SI (match_dup 1) (match_dup 2)) + (ltu:SI (reg:CC_C CC_REGNUM) + (const_int 0)))))) (set (match_operand:SI 0 "register_operand" "=r") (plus:SI (plus:SI (match_dup 1) (match_dup 2)) @@ -748,17 +746,15 @@ (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))]) (parallel [(set (reg:CC_C CC_REGNUM) (ne:CC_C - (plus:DI (plus:DI - (zero_extend:DI (match_dup 4)) - (zero_extend:DI (match_dup 5))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) - (plus:DI (zero_extend:DI - (plus:SI (match_dup 4) (match_dup 5))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) - (set (match_dup 3) (plus:SI - (plus:SI (match_dup 4) (match_dup 5)) - (ltu:SI (reg:CC_C CC_REGNUM) - (const_int 0))))])] + (plus:DI (plus:DI (zero_extend:DI (match_dup 4)) + (zero_extend:DI (match_dup 5))) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) + (zero_extend:DI + (plus:SI (plus:SI (match_dup 4) (match_dup 5)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)))))) + (set (match_dup 3) (plus:SI (plus:SI (match_dup 4) (match_dup 5)) + (ltu:SI (reg:CC_C CC_REGNUM) + (const_int 0))))])] " { operands[3] = gen_highpart (SImode, operands[0]); @@ -777,17 +773,16 @@ [(set (reg:CC_C CC_REGNUM) (ne:CC_C (plus:DI - (plus:DI - (zero_extend:DI (match_operand:SI 1 "register_operand" "r")) - (zero_extend:DI (match_operand:SI 2 "register_operand" "r"))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) - (plus:DI (zero_extend:DI - (plus:SI (match_dup 1) (match_dup 2))) - (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) + (plus:DI + (zero_extend:DI (match_operand:SI 1 "register_operand" "r")) + (zero_extend:DI (match_operand:SI 2 "register_operand" "r"))) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))) + (zero_extend:DI + (plus:SI (plus:SI (match_dup 1) (match_dup 2)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)))))) (set (match_operand:SI 0 "register_operand" "=r") - (plus:SI - (plus:SI (match_dup 1) (match_dup 2)) - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] + (plus:SI (plus:SI (match_dup 1) (match_dup 2)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] "TARGET_32BIT" "adcs%?\\t%0, %1, %2" [(set_attr "conds" "set") @@ -1086,8 +1081,8 @@ }) (define_insn_and_split "subdi3_compare1" - [(set (reg:CC CC_REGNUM) - (compare:CC + [(set (reg:CC_NCV CC_REGNUM) + (compare:CC_NCV (match_operand:DI 1 "register_operand" "r") (match_operand:DI 2 "register_operand" "r"))) (set (match_operand:DI 0 "register_operand" "=&r") @@ -1098,10 +1093,14 @@ [(parallel [(set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) (match_dup 2))) (set (match_dup 0) (minus:SI (match_dup 1) (match_dup 2)))]) - (parallel [(set (reg:CC CC_REGNUM) - (compare:CC (match_dup 4) (match_dup 5))) - (set (match_dup 3) (minus:SI (minus:SI (match_dup 4) (match_dup 5)) - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] + (parallel [(set (reg:CC_NCV_CIC CC_REGNUM) + (compare:CC_NCV_CIC + (zero_extend:DI (match_dup 4)) + (plus:DI (zero_extend:DI (match_dup 5)) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) + (set (match_dup 3) + (minus:SI (minus:SI (match_dup 4) (match_dup 5)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] { operands[3] = gen_highpart (SImode, operands[0]); operands[0] = gen_lowpart (SImode, operands[0]); @@ -1157,13 +1156,15 @@ ) (define_insn "*subsi3_carryin_compare" - [(set (reg:CC CC_REGNUM) - (compare:CC (match_operand:SI 1 "s_register_operand" "r") - (match_operand:SI 2 "s_register_operand" "r"))) + [(set (reg:CC_NCV_CIC CC_REGNUM) + (compare:CC_NCV_CIC + (zero_extend:DI (match_operand:SI 1 "s_register_operand" "r")) + (plus:DI + (zero_extend:DI (match_operand:SI 2 "s_register_operand" "r")) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) (set (match_operand:SI 0 "s_register_operand" "=r") - (minus:SI (minus:SI (match_dup 1) - (match_dup 2)) - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] + (minus:SI (minus:SI (match_dup 1) (match_dup 2)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] "TARGET_32BIT" "sbcs\\t%0, %1, %2" [(set_attr "conds" "set") @@ -1171,13 +1172,15 @@ ) (define_insn "*subsi3_carryin_compare_const" - [(set (reg:CC CC_REGNUM) - (compare:CC (match_operand:SI 1 "reg_or_int_operand" "r") - (match_operand:SI 2 "arm_not_operand" "K"))) + [(set (reg:CC_NCV_CIC CC_REGNUM) + (compare:CC_NCV_CIC + (zero_extend:DI (match_operand:SI 1 "reg_or_int_operand" "r")) + (plus:DI + (zero_extend:DI (match_operand:SI 2 "arm_not_operand" "K")) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) (set (match_operand:SI 0 "s_register_operand" "=r") - (minus:SI (plus:SI (match_dup 1) - (match_dup 2)) - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] + (minus:SI (plus:SI (match_dup 1) (match_dup 2)) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] "TARGET_32BIT" "sbcs\\t%0, %1, #%B2" [(set_attr "conds" "set") @@ -4634,8 +4637,8 @@ (define_insn_and_split "negdi2_compare" - [(set (reg:CC CC_REGNUM) - (compare:CC + [(set (reg:CC_NCV CC_REGNUM) + (compare:CC_NCV (const_int 0) (match_operand:DI 1 "register_operand" "0,r"))) (set (match_operand:DI 0 "register_operand" "=r,&r") @@ -4647,8 +4650,12 @@ (compare:CC (const_int 0) (match_dup 1))) (set (match_dup 0) (minus:SI (const_int 0) (match_dup 1)))]) - (parallel [(set (reg:CC CC_REGNUM) - (compare:CC (const_int 0) (match_dup 3))) + (parallel [(set (reg:CC_NCV_CIC CC_REGNUM) + (compare:CC_NCV_CIC + (const_int 0) + (plus:DI + (zero_extend:DI (match_dup 3)) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) (set (match_dup 2) (minus:SI (minus:SI (const_int 0) (match_dup 3)) @@ -4707,12 +4714,14 @@ ) (define_insn "*negsi2_carryin_compare" - [(set (reg:CC CC_REGNUM) - (compare:CC (const_int 0) - (match_operand:SI 1 "s_register_operand" "r"))) + [(set (reg:CC_NCV_CIC CC_REGNUM) + (compare:CC_NCV_CIC + (const_int 0) + (plus:DI + (zero_extend:DI (match_operand:SI 1 "s_register_operand" "r")) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) (set (match_operand:SI 0 "s_register_operand" "=r") - (minus:SI (minus:SI (const_int 0) - (match_dup 1)) + (minus:SI (minus:SI (const_int 0) (match_dup 1)) (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))] "TARGET_ARM" "rscs\\t%0, %1, #0" @@ -7361,12 +7370,15 @@ "#" ; "cmp\\t%Q0, %Q1\;sbcs\\t%2, %R0, %R1" "&& reload_completed" [(set (reg:CC CC_REGNUM) - (compare:CC (match_dup 0) (match_dup 1))) - (parallel [(set (reg:CC CC_REGNUM) - (compare:CC (match_dup 3) (match_dup 4))) - (set (match_dup 2) - (minus:SI (match_dup 5) - (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] + (compare:CC (match_dup 0) (match_dup 1))) + (parallel [(set (reg:CC_NCV_CIC CC_REGNUM) + (compare:CC_NCV_CIC + (zero_extend:DI (match_dup 3)) + (plus:DI (zero_extend:DI (match_dup 4)) + (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0))))) + (set (match_dup 2) + (minus:SI (match_dup 5) + (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])] { operands[3] = gen_highpart (SImode, operands[0]); operands[0] = gen_lowpart (SImode, operands[0]);