Re: A plan for eliminating cc0

Alexandre Oliva Thu, 24 Mar 2005 00:12:42 -0800

On Mar 23, 2005, Ian Lance Taylor <ian@airs.com> wrote:

> Of course, correctly modeling the effect on the condition codes
> really means putting the information in the RTL so that it is
> exposed to the RTL optimizers.


True, but we want to avoid that because of the loss of optimization
opportunities.  Besides, there's really no way in GCC (AFAIK) to have
a pattern such that, if reload selects one alternative, you get a
register set in one way; if you select another, you get it set
differently, or even not set at all, and this is what we'd need to be
able to model the correct effect of instructions as common as mov.

>> >    2b) Convert conditional branch instructions from separate
>> >        cmpMODE/bCOND instructions to a single conditional branch
>> >        instruction, either by saving condition codes in cmpMODE or
>> >        tstMODE or by using cbranch.

>> The problem here IIRC is a combinatorial explosion on the number of
>> alternatives for the now-merged compare&branch instructions.  Have a
>> look, for example, at the cc0-setting patterns on h8300.md, including
>> the zero-extend and zero-extract ones.  There are many of them, with
>> operands that aren't easy to get into a single cbranch instruction.

> I'm not proposing any sort of combinatorial explosion.

Yes, you are.  Converting cmpMODE/bCOND to cbranch will involve some
combinatorial explosion in cbranch.  We can use computers to handle
this explosion for us, instead of forcing port maintainers to deal
with all the complexity.

> It is not necessary to combine every condition code setting
> instruction with every branch instruction.

Note that I'm not talking about those that set cc as a side effect;
I'm talking only about those whose main purpose is to set the cc.
Even if you limit yourself to those on say h8sx or mn10300, you still
get a very large number of possibilities, because you have to combine
tst/cmp for all available modes, with and without extends or bit
extracts, with condbranch, for all available conditions.  Ok, the
problem of all available conditions is easy to overcome, but you still
have the problem of having to duplicate a dozen or so test/cmp
patterns in the patterns that combine them with branches.

The proposal that RTH and I sort of came up with years ago was meant
to overcome this very duplication issue.  See below.

>> I think a better approach would be to enable cc0 to be used in
>> patterns, but have some machinery behind the scenes that combines
>> expanders for cc0-setting and cc0-using instructions, creates the
>> post-reload splitters and makes the actual insns conditional on
>> reload_completed.  Then one can still model the port in the most
>> natural way, but not overburden the rest of the compiler with cc0
>> support.

> Can you spell that out in more detail, up to the point where somebody
> could start working on it?  That description doesn't give me enough to
> go on.  How would this machinery be invoked?  How would it be
> implemented?

The idea is to get the programs that process .md files such that, when
it reads something like:

;; cc0-setter
(define_insn "cmpsi"
  [(set (cc0)
        (compare (match_operand:SI 0 "register_operand" "!*d*a*x,dax")
                 (match_operand:SI 1 "nonmemory_operand" "*0,daxi")))]
  "<cond1>"
  "@
  btst 0,d0
  cmp %1,%0"
  [(set_attr "cc" "compare,compare")])

;; cc0-user
(define_insn "condbranch"
  [(set (pc)
        (if_then_else (match_operator 1 "comparison_operator"
                                      [(cc0) (const_int 0)])
                      (label_ref (match_operand 0 "" ""))
                      (pc)))]
  "<cond2>"
  "*
{
  if (cc_status.mdep.fpCC)
    return \"fb%b1 %0\";
  if ((cc_status.flags & CC_OVERFLOW_UNUSABLE) != 0
      && (GET_CODE (operands[1]) == GT
          || GET_CODE (operands[1]) == GE
          || GET_CODE (operands[1]) == LE
          || GET_CODE (operands[1]) == LT))
    return 0;
  return \"b%b1 %0\";
}"
 [(set_attr "cc" "none")])


it interprets it as something like:

(define_insn "cmpsi"
  [(set (cc0)
        (compare (match_operand:SI 0 "register_operand" "!*d*a*x,dax")
                 (match_operand:SI 1 "nonmemory_operand" "*0,daxi")))]
  "(! before_cc0_collapsing || reload_completed) && <cond1>"
  "@
  btst 0,d0
  cmp %1,%0"
  [(set_attr "cc" "compare,compare")])

(define_insn "condbranch"
  [(set (pc)
        (if_then_else (match_operator 1 "comparison_operator"
                                      [(cc0) (const_int 0)])
                      (label_ref (match_operand 0 "" ""))
                      (pc)))]
  "(before_cc0_collapsing || reload_completed) && <cond2>"
  "*
{
  if (cc_status.mdep.fpCC)
    return \"fb%b1 %0\";
  if ((cc_status.flags & CC_OVERFLOW_UNUSABLE) != 0
      && (GET_CODE (operands[1]) == GT
          || GET_CODE (operands[1]) == GE
          || GET_CODE (operands[1]) == LE
          || GET_CODE (operands[1]) == LT))
    return 0;
  return \"b%b1 %0\";
}"
 [(set_attr "cc" "none")])

(define_insn_and_split "cc0_condbranch_cmpsi"
  [(sequence [
    (set (cc0)
        (compare (match_operand:SI 0 "register_operand" "!*d*a*x,dax")
                 (match_operand:SI 1 "nonmemory_operand" "*0,daxi")))
    (set (pc)
        (if_then_else (match_operator 3 "comparison_operator"
                                      [(cc0) (const_int 0)])
                      (label_ref (match_operand 2 "" ""))
                      (pc)))]
  )]
  "<cond1>(operands) && <cond2>(operands+2)"
  "#"
  "reload_completed"
  [(set (cc0)
        (compare (match_dup 0)
                 (match_dup 1)
    (set (pc)
        (if_then_else (match_op_dup 3 "comparison_operator"
                                      [(cc0) (const_int 0)])
                      (label_ref (match_dup 2))
                      (pc)))])


The code implementing cond1 and cond2 would have to be defined in
functions, such that they could take shifted operands as arguments.
We'd create such a combined pattern for every pair of cc0-setter and
cc0-user.

We'd add a pass right after rtl generation to go through the
instruction stream looking for cc0 setters and users, and merging them
into a single combined pattern, perhaps reusing the peephole
mechanisms, generating peepholes that reverse the combined
insn_and_split above.

I don't think any ports would require sequences of more than two
instructions to be combined into a single pattern (e.g., have
instructions that meaningfully both use and set cc0).  Supporting this
would be more of a pain, and would require an alternate, more complex
approach than pre-generating the combined patterns.

I realize the sequence construct is already taken for delayed
branches, but that's only in the outermost insn pattern.  We could
overload the meaning, or just enclose the (sequence) in some other
construct (a dummy parallel?), give it a different mode (a CC mode?)
or something else to indicate that this is a combined pattern.  Or
create a different rtl code for cc0 sequences of instructions.  The
syntax is not all that important at this point.

As for the cc attribute, we'd start from:

(define_insn "*movsi_impl"
  [(set (match_operand:SI 0 "nonimmediate_operand"
                                
"=dx,ax,dx,a,dxm,dxm,axm,axm,dx,dx,ax,ax,axR,!*y,*f,*f,dxaQ")
        (match_operand:SI 1 "general_operand"
                                
"0,0,I,I,dx,ax,dx,ax,dixm,aixm,dixm,aixm,!*y,axR,0,dxaQi*f,*f"))]
  "register_operand (operands[0], SImode)
   || register_operand (operands[1], SImode)"
  "*
{
  switch (which_alternative)
    {
[...]
    }
}"
  [(set_attr "cc" 
"none,none,clobber,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none,none_0hit,none_0hit")])

and generate patterns such as:

(define_insn_and_split "*movsi_impl"
  [(set (match_operand:SI 0 "nonimmediate_operand" "...")
        (match_operand:SI 1 "general_operand" "..."))]
  "! cc0_expanding_completed
   && (register_operand (operands[0], SImode)
       || register_operand (operands[1], SImode))"
  "#"
  "cc0_expanding_in_progress"
  [(set (match_dup 0) (match_dup 1))
   (match_dup 2)]
  "
{
  operands[2] = gen_cc0_set_for (insn);
}"
  [(set_attr "cc" 
"none,none,clobber,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none,none_0hit,none_0hit")])

  
(define_insn "*movsi_impl_cc0_set"
  [(set (match_operand:SI 0 "nonimmediate_operand" "...")
        (match_operand:SI 1 "general_operand" "..."))
   (match_operand 2 "cc0_set_operand" "")]
  "(cc0_expanding_in_progress || cc0_expanding_completed)"
  "*
{
[...]
}"
  [(set_attr "cc" 
"none,none,clobber,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none_0hit,none,none_0hit,none_0hit")])


I'm not sure whether it would make sense to implicitly generate
different insns for all possible values of the cc attribute, and have
different predicates for cc0_set_operand, or whether we could just
accept all possible variations in cc0_set_operand.  I'm not even sure
it actually makes sense to handle anything but clobber, since we're
going to retain the NOTICE_CC_UPDATE machinery unchanged.

One more thing to note is that we have to force splitters to run after
reload, with cc0_expanding_in_progress, such that patterns that don't
have the clobbers or some dummy pattern in its stead don't survive
past the point in which we split cc0 setters and users out of single
combined insns.  It might make sense to use cc0_expanding_in_progress
and cc0_expanding_completed in the conditions of the modified cmpsi
and condbranch patterns above, but it doesn't seem to be necessary.

The port would have to implement the gen_cc0_set function (or just a
cc0_set pattern) and the cc0_set_operand predicate, but these should
be fairly trivial to add given knowledge about the attributes.

Ideally, we should have some means to avoid duplicating all insns with
and without the cc0_set operand, but I don't see a simple way to tell
whether an attribute holds a value for all alternatives of an insn, in
the general case.

>> Why not use the relatively-common-in-cc0-ports
>> cc attribute for this purpose?

> That would be fine.  It would be trivial to define clobbercc in terms
> of cc for ports which have it.

See the problem of alternatives above.  Would it not be a problem in
your proposal?

-- 
Alexandre Oliva             http://www.ic.unicamp.br/~oliva/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}

Re: A plan for eliminating cc0

Reply via email to