"Richard Earnshaw (lists)" <richard.earns...@arm.com> writes: > On 06/04/2020 12:19, Richard Sandiford wrote: >> "Richard Earnshaw (lists)" <richard.earns...@arm.com> writes: >>> On 03/04/2020 16:03, Richard Sandiford wrote: >>>> "Richard Earnshaw (lists)" <richard.earns...@arm.com> writes: >>>>> On 03/04/2020 13:27, Richard Sandiford wrote: >>>>>> "Richard Earnshaw (lists)" <richard.earns...@arm.com> writes: >>>>>>> On 02/04/2020 19:53, Richard Henderson via Gcc-patches wrote: >>>>>>>> This is attacking case 3 of PR 94174. >>>>>>>> >>>>>>>> In v2, I unify the various subtract-with-borrow and add-with-carry >>>>>>>> patterns that also output flags with unspecs. As suggested by >>>>>>>> Richard Sandiford during review of v1. It does seem cleaner. >>>>>>>> >>>>>>> >>>>>>> Really? I didn't need to use any unspecs for the Arm version of this. >>>>>>> >>>>>>> R. >>>>>> >>>>>> See https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543063.html >>>>>> (including quoted context) for how we got here. >>>>>> >>>>>> The same problem affects the existing aarch64 patterns like >>>>>> *usub<GPI:mode>3_carryinC. Although that pattern avoids unspecs, >>>>>> the compare:CC doesn't seem to be correct. >>>>>> >>>>>> Richard >>>>>> >>>>> >>>>> But I don't think you can use ANY_EXTEND in these comparisons. It >>>>> doesn't describe what the instruction does, since the instruction does >>>>> not really extend the values first. >>>> >>>> Yeah, that was the starting point in the thread above too. And using >>>> zero_extend in the existing *usub<GPI:mode>3_carryinC pattern: >>>> >>>> (define_insn "*usub<GPI:mode>3_carryinC" >>>> [(set (reg:CC CC_REGNUM) >>>> (compare:CC >>>> (zero_extend:<DWI> >>>> (match_operand:GPI 1 "register_operand" "r")) >>>> (plus:<DWI> >>>> (zero_extend:<DWI> >>>> (match_operand:GPI 2 "register_operand" "r")) >>>> (match_operand:<DWI> 3 "aarch64_borrow_operation" "")))) >>>> (set (match_operand:GPI 0 "register_operand" "=r") >>>> (minus:GPI >>>> (minus:GPI (match_dup 1) (match_dup 2)) >>>> (match_operand:GPI 4 "aarch64_borrow_operation" "")))] >>>> "" >>>> "sbcs\\t%<w>0, %<w>1, %<w>2" >>>> [(set_attr "type" "adc_reg")] >>>> ) >>>> >>>> looks wrong for the same reason. But the main problem IMO isn't how the >>>> inputs to the compare:CC are represented, but that we're using compare:CC >>>> at all. Using compare doesn't accurately model the effect of SBCS on NZCV >>>> for all inputs, so if we're going to use a compare here, it can't be :CC. >>>> >>>>> I would really expect this patch series to be pretty much a dual of this >>>>> series that I posted last year for Arm. >>>>> >>>>> https://gcc.gnu.org/pipermail/gcc-patches/2019-October/532180.html >>>> >>>> That series uses compares with modes like CC_V and CC_B, so I think >>>> you're saying that given the choice in the earlier thread between adding >>>> a new CC mode or using unspecs, you would have preferred a new CC mode, >>>> is that right? >>>> >>> >>> Yes. It surprised me, when doing the aarch32 version, just how often >>> the mid-end parts of the compiler were able to reason about parts of the >>> parallel insn and optimize things accordingly (eg propagating the truth >>> of the comparison). If you use an unspec that can never happen. >> >> That could be changed though. E.g. we could add something like a >> simplify_unspec target hook if this becomes a problem (either here >> or for other unspecs). A fancy implementation could even use >> match.pd-style rules in the .md file. > > I really don't like that. It sounds like the top of a long slippery > slope. What about all the other cases where the RTL is comprehended by > the mid-end?
Is it really so bad though? It's similar to how frontends can define and fold their own trees, and how targets can define and fold their own built-in functions. The CC modes are also quite heavily macro/hook-based. >> The reason I'm not keen on using special modes for this case is that >> they'd describe one way in which the result can be used rather than >> describing what the instruction actually does. The instruction really >> does set all four flags to useful values. The "problem" is that they're >> not the values associated with a compare between two values, so representing >> them that way will always lose information. >> > > Yes, it's true that the rtl -> machine instruction transform is not 100% > reversible. That's always been the case, but it's the price we pay for > a generic IL that describes instructions on multiple architectures. It's not really reversibility that I'm after (at least not for its own sake). If we had a three-input compare_cc rtx_code that described a comparison involving a carry input, we'd certainly be using it here, because that's what the instruction does. Given that we don't have the rtx_code, three obvious choices are: (1) Add it. (2) Continue to represent what the instruction does using an unspec. (3) Don't try to represent the "three-input compare_cc" operation and instead describe a two-input comparison that only yields a valid result for a subset of tests. (1) seems like the best technical solution but would probably be a lot of work. I guess the reason I like (2) is that it stays closest to (1). Thanks, Richard