Richard Biener <richard.guent...@gmail.com> writes:
> On Mon, Jun 24, 2024 at 10:03 AM Richard Sandiford
> <richard.sandif...@arm.com> wrote:
>>
>> Richard Biener <richard.guent...@gmail.com> writes:
>> > On Sat, Jun 22, 2024 at 6:50 PM Richard Sandiford
>> >> The traditional (and IMO correct) way to handle this is to make the
>> >> pattern reserve the temporary registers that it needs, using
>> >> match_scratches.
>> >> rs6000 has many examples of this.  E.g.:
>> >>
>> >> (define_insn_and_split "@ieee_128bit_vsx_neg<mode>2"
>> >>   [(set (match_operand:IEEE128 0 "register_operand" "=wa")
>> >>         (neg:IEEE128 (match_operand:IEEE128 1 "register_operand" "wa")))
>> >>    (clobber (match_scratch:V16QI 2 "=v"))]
>> >>   "TARGET_FLOAT128_TYPE && !TARGET_FLOAT128_HW"
>> >>   "#"
>> >>   "&& 1"
>> >>   [(parallel [(set (match_dup 0)
>> >>                    (neg:IEEE128 (match_dup 1)))
>> >>               (use (match_dup 2))])]
>> >> {
>> >>   if (GET_CODE (operands[2]) == SCRATCH)
>> >>     operands[2] = gen_reg_rtx (V16QImode);
>> >>
>> >>   emit_insn (gen_ieee_128bit_negative_zero (operands[2]));
>> >> }
>> >>   [(set_attr "length" "8")
>> >>    (set_attr "type" "vecsimple")])
>> >>
>> >> Before RA, this is just:
>> >>
>> >>   (set ...)
>> >>   (clobber (scratch:V16QI))
>> >>
>> >> and the split creates a new register.  After RA, operand 2 provides
>> >> the required temporary register:
>> >>
>> >>   (set ...)
>> >>   (clobber (reg:V16QI TMP))
>> >>
>> >> Another approach is to add can_create_pseudo_p () to the define_insn
>> >> condition (rather than the split condition).  But IMO that's an ICE
>> >> trap, since insns that have already been matched & accepted shouldn't
>> >> suddenly become invalid if recog is reattempted later.
>> >
>> > What about splitting immediately in late-combine?  Wouldn't that possibly
>> > allow more combinations to immediately happen?
>>
>> It would be difficult to guarantee termination.  Often the split
>> instructions can be immediately recombined back to the original
>> instruction.  Even if we guard against that happening directly,
>> it'd be difficult to prove that it can't happen indirectly.
>>
>> We might also run into issues like PR101523.
>>
>> Combine uses define_splits (without define_insns) for 3->2 combinations,
>> but the current late-combine optimisation is kind-of 1/N+1->1 x N.
>>
>> Personally, I think we should allow targets to use the .md file to
>> define match.pd-style simplification rules involving unspecs, but there
>> were objections to that when I last suggested it.
>
> Isn't that what basically "combine-helper" patterns do to some extent?

Partly, but:

(1) It's a big hammer.  It means we add all the overhead of a define_insn
    for something that is only meant to survive between one pass and
    the next.

(2) Unlike match.pd, it isn't designed to be applied iteratively.
    There is no attempt even in theory to ensure that match helper
    -> split -> match helper -> split -> ... would terminate.

(3) It operates at the level of complete instructions, including e.g.
    destinations of sets.  The kind of rule I had in mind would be
    aimed at arithmetic simplification, and would operate at the
    simplify-rtx.cc level.

That is, if simplify_foo failed to apply a target-independent rule,
it could fall back on an automatically generated target-specific rule,
with the requirement/understanding that these rules really should be
target-specific.  One easy way of enforcing that is to say that
at least one side of a production rule must involve an unspec.

Richard
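
For illustration only, the fallback scheme described above can be modelled
in a few lines of Python.  This is a toy sketch, not GCC code: expressions
are nested tuples, and every name in it (TargetRules, simplify_generic,
the "REV" unspec and the "?x" wildcard syntax) is a hypothetical stand-in
rather than anything that exists in simplify-rtx.cc or the .md format.
It shows a generic simplifier that falls back on a target rule table, and
a registration check that rejects rules in which neither side involves
an unspec.

```python
# Toy model only: expressions are nested tuples such as ("plus", a, b).
# All names are hypothetical and do not correspond to GCC internals.

def contains_unspec(expr):
    """Return True if expr or any subexpression is an ("unspec", ...) node."""
    if not isinstance(expr, tuple):
        return False
    if expr and expr[0] == "unspec":
        return True
    return any(contains_unspec(op) for op in expr[1:])


def match(pattern, expr, bindings):
    """Match expr against pattern; strings starting with '?' are wildcards."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in bindings:
            return bindings[pattern] == expr
        bindings[pattern] = expr
        return True
    if isinstance(pattern, tuple) and isinstance(expr, tuple):
        return (len(pattern) == len(expr)
                and all(match(p, e, bindings) for p, e in zip(pattern, expr)))
    return pattern == expr


def substitute(template, bindings):
    """Instantiate template using the wildcard bindings."""
    if isinstance(template, str) and template.startswith("?"):
        return bindings[template]
    if isinstance(template, tuple):
        return tuple(substitute(t, bindings) for t in template)
    return template


class TargetRules:
    """Table of target-specific production rules (lhs -> rhs)."""

    def __init__(self):
        self.rules = []

    def add(self, lhs, rhs):
        # Enforce target-specificity: at least one side must involve an unspec.
        if not (contains_unspec(lhs) or contains_unspec(rhs)):
            raise ValueError("target rule must involve an unspec")
        self.rules.append((lhs, rhs))

    def apply(self, expr):
        """Return the first matching rewrite of expr, or None."""
        for lhs, rhs in self.rules:
            bindings = {}
            if match(lhs, expr, bindings):
                return substitute(rhs, bindings)
        return None


def simplify_generic(expr):
    """Stand-in for a target-independent rule: x + 0 -> x."""
    if (isinstance(expr, tuple) and len(expr) == 3
            and expr[0] == "plus" and expr[2] == ("const", 0)):
        return expr[1]
    return None


def simplify(expr, target_rules):
    """Try the generic rule first, then fall back on the target rules."""
    result = simplify_generic(expr)
    if result is None:
        result = target_rules.apply(expr)
    return result


rules = TargetRules()
# Hypothetical target rule: applying the "REV" unspec twice is a no-op.
rules.add(("unspec", "REV", ("unspec", "REV", "?x")), "?x")
print(simplify(("unspec", "REV", ("unspec", "REV", ("reg", 2))), rules))
```

Trying to register a purely generic rule such as x + 0 -> x through
rules.add raises ValueError in this model, which is the enforcement
mechanism suggested above.  The sketch says nothing about termination
under repeated application; that would need a separate argument.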