Re: A plan for eliminating cc0

Hans-Peter Nilsson Tue, 29 Mar 2005 19:13:11 -0800

I'm behind on reading mailing lists and only "skipped ahead" for
this thread.  (I may have missed some related follow-ups.)


> From: Ian Lance Taylor <[email protected]>
> Date: 24 Mar 2005 11:44:52 -0500

> 1) Modify the programs which read the .md file to look for an
>    attribute named clobbercc.  If such an attribute exists, then for
>    any instruction pattern which defines clobbercc as "yes" (or
>    "true", or whatever), automatically add "(clobber (reg:CC CC_REG))"
>    to the instruction.

I had a similar, though less fleshed-out, plan in which I named
the attribute "cc0" (that name is not used by backends; values
matching yours would be "clobber" and "unmodified" or "none").
The "cc0" name seems better and more intuitive than "clobbercc",
which implies a binary value.  It'd be easier language-wise to
add handling for values like "set_dest" (cc0 set according to
destination); the attribute name wouldn't then need tweaking to
make sense later on.  The same goes, perhaps more importantly,
for the cc_status.flags handling: other cc0 attribute values
could be translated to machine-generated CCmodes.
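
To make the multi-valued attribute idea concrete, here is a toy
sketch in Python.  It is not GCC code (the real gen* programs are
C), RTL is modelled as nested tuples, and the function name
apply_cc0_attr, the register number 17 and the exact shape of the
"set_dest" side effect are all assumptions made for illustration.

```python
# Toy model, not GCC code: RTL is written as nested Python tuples, and
# CC_REG = 17 is a made-up register number.
CC_REG = 17
CC = ("reg:CC", CC_REG)

def apply_cc0_attr(pattern, cc0_attr):
    """Return `pattern` plus the CC side effect named by the attribute."""
    if cc0_attr == "none":
        return pattern                  # flags untouched, nothing to add
    if cc0_attr == "clobber":
        extra = ("clobber", CC)
    elif cc0_attr == "set_dest":
        # Assumption: the flags describe the value being stored.  All
        # members of a parallel read their inputs before any of them
        # writes, so the compare uses SET_SRC, not the destination.
        if pattern[0] != "set":
            raise ValueError("set_dest needs a single set")
        extra = ("set", CC, ("compare:CC", pattern[2], ("const_int", 0)))
    else:
        raise ValueError("unknown cc0 value: %r" % (cc0_attr,))
    return ("parallel", [pattern, extra])

r0, r1 = ("reg:SI", 0), ("reg:SI", 1)
add = ("set", r0, ("plus:SI", r0, r1))
for value in ("none", "clobber", "set_dest"):
    print(value, apply_cc0_attr(add, value))
```

The point is only that a new attribute value is one more branch,
with no renaming needed, which a yes/no "clobbercc" would not
allow.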

Working on a plan I got stuck on a couple of things, most
importantly the cost/benefit below, IMHO a show-stopper.

Another point was how to get a usable register number for cc0
and associated.  Leaving it to the target to define makes sense
but it also means there's some work to adjust register classes
etc; an area where bugs get nasty (i.e. hard to determine the
cause from the effect).  I wish there was a gap between the hard
registers and the first pseudo register.

> 2) Modify the programs which read the .md file to look for
>    instructions which set cc0 and instructions which use cc0.  If
>    CC_REG is defined for the backend, then for each such instruction:
...

Makes sense, as long as *everything* in (2) happens in the gen*
programs.

> 4) For each target which uses cc0:
...
>    4b) For insn patterns for which some alternatives clobber CC and
>        some do not, split the instruction after reload into one
>        variant which clobbers the CC and one variant which does not.
>        Or just write different patterns which are only recognized
>        after reload.

An IMHO show-stopping problem with 4b is that writing all
matching variants is lots of work; the conversion cost in total
would not be significantly less work for a target than "brute
force" conversion (like the one done in the i386 port) and you'd
get pattern explosion in the .md.  At least if you want to avoid
a performance regression.

> 5) Write a new optimization pass enabled on targets which define
>    NOTICE_UPDATE_CC.  I think this pass would be run just before
>    machine dependent reorg, although perhaps there is a better place
>    for it.  Walk through the instructions, calling NOTICE_UPDATE_CC on
>    each one.  When we find an instruction which sets CC_REG, check the
>    source of the set with the current CC status, just as
>    final_scan_insn does now.  If the current CC status is the same,
>    delete the instruction which sets CC_REG.

(Needs to be augmented with:) ... but first change the noticed
insn into one that also sets CC_REG (a parallel, if it isn't one
already), and check that the resulting parallel insn is
recognized before going ahead with removing the original
cc0-setter.  (If you don't do this step, you'll get in trouble
with missing sets of CC_REG for flow info, and unrecognized-insn
ICEs.)  If the resulting insn is unrecognized and
ENABLE_CHECKING is defined, emit a warning, so the target
maintainer notices the opportunity for optimization (i.e. one
that was there with the "old" NOTICE_UPDATE_CC handling).
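
A rough sketch of the augmented pass, again as a Python toy and
not as GCC code.  It only handles compares against zero, treats
every insn that does not set the flags from its destination as
clobbering them, and uses two hypothetical callbacks:
sets_flags_from_dest stands in for NOTICE_UPDATE_CC and recog
stands in for the recognizer.  The name run_pass and the
register number 17 are invented too.

```python
CC = ("reg:CC", 17)          # 17 is a made-up CC_REG number
ZERO = ("const_int", 0)

def is_cc_set(pat):
    return pat[0] == "set" and pat[1] == CC

def run_pass(insns, sets_flags_from_dest, recog, warn=print):
    """Drop compare-with-zero insns whose result the flags already hold.

    insns                 list of single-set patterns (nested tuples)
    sets_flags_from_dest  stand-in for NOTICE_UPDATE_CC: True if the insn
                          leaves the flags describing its destination
    recog                 stand-in for the recognizer: True if the target
                          has a pattern matching the given RTL
    """
    out = []
    status = None    # register whose value the flags currently describe
    setter = None    # index in `out` of the insn that implicitly set them
    for pat in insns:
        if not is_cc_set(pat):
            out.append(pat)
            if sets_flags_from_dest(pat):
                status, setter = pat[1], len(out) - 1
            else:
                status, setter = None, None     # conservatively: clobbered
            continue
        src = pat[2]                            # (compare:CC x y)
        reg = src[1] if src[2] == ZERO else None
        if reg is not None and status == reg:
            if setter is None:
                continue        # an explicit setter already did this
            prev = out[setter]
            combined = ("parallel",
                        [prev, ("set", CC, ("compare:CC", prev[2], ZERO))])
            if recog(combined):
                out[setter] = combined          # CC set now visible to flow
                setter = None
                continue                        # original setter removed
            warn("missed CC optimization, no pattern for %r" % (combined,))
        out.append(pat)
        status, setter = reg, None
    return out

r0, r1 = ("reg:SI", 0), ("reg:SI", 1)
add = ("set", r0, ("plus:SI", r0, r1))
cmp0 = ("set", CC, ("compare:CC", r0, ZERO))
is_add = lambda pat: pat[2][0] == "plus:SI"
print(run_pass([add, cmp0], is_add, recog=lambda pat: True))
```

Note the order: the compare is only dropped after the combined
parallel has been accepted by recog; otherwise the compare stays
and a warning points at the missing pattern.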

> At this point, the generated code quality should be approximately the
> same as when the target used cc0.

Not without pattern explosion for the additional cc0-setter
combinations.  Also, extra tweaks would be needed to make up for
the missing handling of e.g. cc_status.flags.  I suggest (and
it's ok to admit) that your approach is likely to decrease
performance unless the target is massively tweaked, but that it
is offered as an easy way out for target maintainers, and for
targets lacking interested maintainers.

> I want to stress that this approach is intended to permit
> reasonably simple elimination of cc0 for all targets.  It does not
> preclude any particular target from using a different approach.

Yes, this is good, a sign of a sane approach. :-)

> I don't think that a more accurate representation will help very much
> without a lot more work, because the optimizers won't really be able
> to use the better representation until after reload, and we don't do
> very much optimization after reload.  Specifically, I don't think a
> better representation will improve very much on the proposed
> NOTICE_UPDATE_CC optimization pass.  But nothing in what I am
> suggesting precludes following this path.

Except for the clobbercc *name*. :-)

But I think this suggestion and similar ones along this line are
doing too much for too little gain.  To wit, if you want
matching performance, you still have pattern explosion in the
.md as noted above.  Even if performance is unimportant, you
*still* need to tweak the .md and also the .h register macros.

A basic likely-pessimizing conversion is about the same amount
of work as that; delete sCC patterns and other not-strictly-
necessary cc0 users, change remaining users (only branches IIRC)
to cbranch without post-reload splitters.  (My preferred
alternative, already known but frequently reinvented. :-)

So, I don't think the added infrastructure pays off, initially
or maintenance-wise.  But I guess that argument holds less if
the added machinery is for free (i.e. Someone Else writes
it. :-)

I haven't come up with anything substantial for the cc0 issue in
general myself but FWIW I do have a "migration path" ready for
the CRIS target: brute force followed by damage control for the
pattern explosion by "adding some more rtl-macro machinery".
But I don't know *what* added machinery.  I played with some
ideas but concluded that I couldn't defend any proposal until
I've actually done that conversion.  The current mode- and
code-macros (misnomer; actually, mode- and code-*iterators*;
they're an inside-out version of macros) don't help with whole
sequences, like *for example* one of:
 (clobber REG_CC)
 (set (reg:CC) (match_operand:0 ...))
 (set (reg:CC) (whatever SET_SRC, unless overlap with SET_DEST))
 (nothing)
I believe this would lead to some simple extension for
define_code_macro, but I'm not sure of exactly what, or if
there's a greater target-maintenance benefit with some other
change.

The only thing I'm reasonably sure of is that it'd help if the
gen* machinery would allow and translate a
 (parallel [(single rtl pattern) (nil)])
into
 (single rtl pattern)
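
As a toy illustration of that translation (Python, not GCC code;
RTL is modelled as nested tuples, and the names
simplify_parallel and expand, the ("nil",) encoding and the
register number 17 are invented for the example):

```python
# Toy model, not GCC code: RTL as nested tuples; ("nil",) stands for the
# (nil) placeholder and 17 is a made-up CC register number.
NIL = ("nil",)
CC = ("reg:CC", 17)

def simplify_parallel(rtx):
    """Drop (nil) members; unwrap a parallel left with one member."""
    if rtx[0] != "parallel":
        return rtx
    members = [m for m in rtx[1] if m != NIL]
    if len(members) == 1:
        return members[0]
    return ("parallel", members)

def expand(template, cc_effects):
    """One pattern per CC effect, the way an iterator would expand."""
    return [simplify_parallel(("parallel", [template, effect]))
            for effect in cc_effects]

move = ("set", ("reg:SI", 0), ("reg:SI", 1))
patterns = expand(move, [("clobber", CC), NIL])
for pattern in patterns:
    print(pattern)
```

With that in place a single template could be written as a
parallel and iterated over CC effects, with the "(nothing)" case
falling out as the plain single pattern.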

And while we're here, a suggestion for the conversion process:
All cc0 targets must have simulators and newlib ports matching
the simplest simtest-howto.html description (drop-in equivalent
for newlib ok, i.e. expands in its own dir and with toplevel
patches to configury, with publicly accessible tarball for the
lib) and corresponding sim "baseboard" description.  Else,
they'll be obsoleted and removed before any CC0 revolution
(removal of HAVE_cc0 in the middle-end).  (Actually the lib+sim
requirement should be applied to all new targets.)  Conversion
for a target is considered successful if there are no
regressions with "make check".  Listed maintainers would have
conversion rights.

brgds, H-P
