On Mar 23, 2005, Ian Lance Taylor <ian@airs.com> wrote: > Of course, correctly modeling the effect on the condition codes > really means putting the information in the RTL so that it is > exposed to the RTL optimizers.
True, but we want to avoid that because of the loss of optimization opportunities. Besides, there's really no way in GCC (AFAIK) to have a pattern such that, if reload selects one alternative, you get a register set in one way; if you select another, you get it set differently, or even not set at all, and this is what we'd need to be able to model the correct effect of instructions as common as mov. >> > 2b) Convert conditional branch instructions from separate >> > cmpMODE/bCOND instructions to a single conditional branch >> > instruction, either by saving condition codes in cmpMODE or >> > tstMODE or by using cbranch. >> The problem here IIRC is a combinatorial explosion on the number of >> alternatives for the now-merged compare&branch instructions. Have a >> look, for example, at the cc0-setting patterns on h8300.md, including >> the zero-extend and zero-extract ones. There are many of them, with >> operands that aren't easy to get into a single cbranch instruction. > I'm not proposing any sort of combinatorial explosion. Yes, you are. Converting cmpMODE/bCOND to cbranch will involve some combinatorial explosion in cbranch. We can use computers to handle this explosion for us, instead of forcing port maintainers to deal with all the complexity. > It is not necessary to combine every condition code setting > instruction with every branch instruction. Note that I'm not talking about those that set cc as a side effect; I'm talking only about those whose main purpose is to set the cc. Even if you limit yourself to those on say h8sx or mn10300, you still get a very large number of possibilities, because you have to combine tst/cmp for all available modes, with and without extends or bit extracts, with condbranch, for all available conditions. Ok, the problem of all available conditions is easy to overcome, but you still have the problem of having to duplicate a dozen or so test/cmp patterns in the patterns that combine them with branches. The proposal that RTH and I sort of came up with years ago was meant to overcome this very duplication issue. See below. >> I think a better approach would be to enable cc0 to be used in >> patterns, but have some machinery behind the scenes that combines >> expanders for cc0-setting and cc0-using instructions, creates the >> post-reload splitters and makes the actual insns conditional on >> reload_completed. Then one can still model the port in the most >> natural way, but not overburden the rest of the compiler with cc0 >> support. > Can you spell that out in more detail, up to the point where somebody > could start working on it? That description doesn't give me enough to > go on. How would this machinery be invoked? How would it be > implemented? The idea is to get the programs that process .md files such that, when it reads something like: ;; cc0-setter (define_insn "cmpsi" [(set (cc0) (compare (match_operand:SI 0 "register_operand" "!*d*a*x,dax") (match_operand:SI 1 "nonmemory_operand" "*0,daxi")))] "<cond1>" "@ btst 0,d0 cmp %1,%0" [(set_attr "cc" "compare,compare")]) ;; cc0-user (define_insn "condbranch" [(set (pc) (if_then_else (match_operator 1 "comparison_operator" [(cc0) (const_int 0)]) (label_ref (match_operand 0 "" "")) (pc)))] "<cond2>" "* { if (cc_status.mdep.fpCC) return \"fb%b1 %0\"; if ((cc_status.flags & CC_OVERFLOW_UNUSABLE) != 0 && (GET_CODE (operands[1]) == GT || GET_CODE (operands[1]) == GE || GET_CODE (operands[1]) == LE || GET_CODE (operands[1]) == LT)) return 0; return \"b%b1 %0\"; }" [(set_attr "cc" "none")]) it interprets it as something like: (define_insn "cmpsi" [(set (cc0) (compare (match_operand:SI 0 "register_operand" "!*d*a*x,dax") (match_operand:SI 1 "nonmemory_operand" "*0,daxi")))] "(! before_cc0_collapsing || reload_completed) && <cond1>" "@ btst 0,d0 cmp %1,%0" [(set_attr "cc" "compare,compare")]) (define_insn "condbranch" [(set (pc) (if_then_else (match_operator 1 "comparison_operator" [(cc0) (const_int 0)]) (label_ref (match_operand 0 "" "")) (pc)))] "(before_cc0_collapsing || reload_completed) && <cond2>" "* { if (cc_status.mdep.fpCC) return \"fb%b1 %0\"; if ((cc_status.flags & CC_OVERFLOW_UNUSABLE) != 0 && (GET_CODE (operands[1]) == GT || GET_CODE (operands[1]) == GE || GET_CODE (operands[1]) == LE || GET_CODE (operands[1]) == LT)) return 0; return \"b%b1 %0\"; }" [(set_attr "cc" "none")]) (define_insn_and_split "cc0_condbranch_cmpsi" [(sequence [ (set (cc0) (compare (match_operand:SI 0 "register_operand" "!*d*a*x,dax") (match_operand:SI 1 "nonmemory_operand" "*0,daxi"))) (set (pc) (if_then_else (match_operator 3 "comparison_operator" [(cc0) (const_int 0)]) (label_ref (match_operand 2 "" "")) (pc)))] )] "<cond1>(operands) && <cond2>(operands+2)" "#" "reload_completed" [(set (cc0) (compare (match_dup 0) (match_dup 1) (set (pc) (if_then_else (match_op_dup 3 "comparison_operator" [(cc0) (const_int 0)]) (label_ref (match_dup 2)) (pc)))]) The code implementing cond1 and cond2 would have to be defined in functions, such that they could take shifted operands as arguments. We'd create such a combined pattern for every pair of cc0-setter and cc0-user. We'd add a pass right after rtl generation to go through the instruction stream looking for cc0 setters and users, and merging them into a single combined pattern, perhaps reusing the peephole mechanisms, generating peepholes that reverse the combined insn_and_split above. I don't think any ports would require sequences of more than two instructions to be combined into a single pattern (e.g., have instructions that meaningfully both use and set cc0). Supporting this would be more of a pain, and would require an alternate, more complex approach than pre-generating the combined patterns. I realize the sequence construct is already taken for delayed branches, but that's only in the outermost insn pattern. We could overload the meaning, or just enclose the (sequence) in some other construct (a dummy parallel?), give it a different mode (a CC mode?) or something else to indicate that this is a combined pattern. Or create a different rtl code for cc0 sequences of instructions. The syntax is not all that important at this point. As for the cc attribute, we'd start from: (define_insn "*movsi_impl" [(set (match_operand:SI 0 "nonimmediate_operand" "=dx,ax,dx,a,dxm,dxm,axm,axm,dx,dx,ax,ax,axR,!*y,*f,*f,dxaQ") (match_operand:SI 1 "general_operand" "0,0,I,I,dx,ax,dx,ax,dixm,aixm,dixm,aixm,!*y,axR,0,dxaQi*f,*f"))] "register_operand (operands[0], SImode) || register_operand (operands[1], SImode)" "* { switch (which_alternative) { [...] } }" [(set_attr "cc" "none,none,clobber,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none,none_0hit,none_0hit")]) and generate patterns such as: (define_insn_and_split "*movsi_impl" [(set (match_operand:SI 0 "nonimmediate_operand" "...") (match_operand:SI 1 "general_operand" "..."))] "! cc0_expanding_completed && (register_operand (operands[0], SImode) || register_operand (operands[1], SImode))" "#" "cc0_expanding_in_progress" [(set (match_dup 0) (match_dup 1)) (match_dup 2)] " { operands[2] = gen_cc0_set_for (insn); }" [(set_attr "cc" "none,none,clobber,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none,none_0hit,none_0hit")]) (define_insn "*movsi_impl_cc0_set" [(set (match_operand:SI 0 "nonimmediate_operand" "...") (match_operand:SI 1 "general_operand" "...")) (match_operand 2 "cc0_set_operand" "")] "(cc0_expanding_in_progress || cc0_expanding_completed)" "* { [...] }" [(set_attr "cc" "none,none,clobber,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none,none_0hit,none_0hit")]) I'm not sure whether it would make sense to implicitly generate different insns for all possible values of the cc attribute, and have different predicates for cc0_set_operand, or whether we could just accept all possible variations in cc0_set_operand. I'm not even sure it actually makes sense to handle anything but clobber, since we're going to retain the NOTICE_CC_UPDATE machinery unchanged. One more thing to note is that we have to force splitters to run after reload, with cc0_expanding_in_progress, such that patterns that don't have the clobbers or some dummy pattern in its stead don't survive past the point in which we split cc0 setters and users out of single combined insns. It might make sense to use cc0_expanding_in_progress and cc0_expanding_completed in the conditions of the modified cmpsi and condbranch patterns above, but it doesn't seem to be necessary. The port would have to implement the gen_cc0_set function (or just a cc0_set pattern) and the cc0_set_operand predicate, but these should be fairly trivial to add given knowledge about the attributes. Ideally, we should have some means to avoid duplicating all insns with and without the cc0_set operand, but I don't see a simple way to tell whether an attribute holds a value for all alternatives of an insn, in the general case. >> Why not use the relatively-common-in-cc0-ports >> cc attribute for this purpose? > That would be fine. It would be trivial to define clobbercc in terms > of cc for ports which have it. See the problem of alternatives above. Would it not be a problem in your proposal? -- Alexandre Oliva http://www.ic.unicamp.br/~oliva/ Red Hat Compiler Engineer [EMAIL PROTECTED], gcc.gnu.org} Free Software Evangelist [EMAIL PROTECTED], gnu.org}