I'm behind on reading mailing lists and only "skipped ahead" for this thread. (I may have missed some related follow-ups.)
> From: Ian Lance Taylor <ian@airs.com>
> Date: 24 Mar 2005 11:44:52 -0500

> 1) Modify the programs which read the .md file to look for an
>    attribute named clobbercc. If such an attribute exists, then for
>    any instruction pattern which defines clobbercc as "yes" (or
>    "true", or whatever), automatically add "(clobber (reg:CC CC_REG))"
>    to the instruction.

I had a similar but not substantiated plan, where I named the
attribute "cc0" (that name is not used by backends; values matching
yours would be "clobber" and "unmodified" or "none"). The "cc0" name
seems better and more intuitive than "clobbercc" which implies a
binary value. It'd be easier language-wise to add handling for values
like "set_dest" (cc0 set according to destination); the attribute
name wouldn't then need tweaking to make sense later on. Similarly
but perhaps more importantly for the cc_status.flags handling;
translating other cc0 attribute values to machine-generated CCmodes.

Working on a plan I got stuck on a couple of things, most importantly
the cost/benefit below, IMHO a show-stopper. Another point was how to
get a usable register number for cc0 and associated. Leaving it to
the target to define makes sense but it also means there's some work
to adjust register classes etc; an area where bugs get nasty (i.e.
hard to determine the cause from the effect). I wish there was a gap
between the hard registers and the first pseudo register.

> 2) Modify the programs which read the .md file to look for
>    instructions which set cc0 and instructions which use cc0. If
>    CC_REG is defined for the backend, then for each such instruction:
...

Makes sense, as long as *everything* in (2) happens in the gen*
programs.

> 4) For each target which uses cc0:
...
> 4b) For insn patterns for which some alternatives clobber CC and
>     some do not, split the instruction after reload into one
>     variant which clobbers the CC and one variant which does not.
>     Or just write different patterns which are only recognized
>     after reload.

An IMHO show-stopping problem with 4b is that writing all matching
variants is lots of work; the conversion cost in total would not be
significantly less work for a target than "brute force" conversion
(like the one done in the i386 port) and you'd get pattern explosion
in the .md. At least if you want to avoid a performance regression.

> 5) Write a new optimization pass enabled on targets which define
>    NOTICE_UPDATE_CC. I think this pass would be run just before
>    machine dependent reorg, although perhaps there is a better place
>    for it. Walk through the instructions, calling NOTICE_UPDATE_CC on
>    each one. When we find an instruction which sets CC_REG, check the
>    source of the set with the current CC status, just as
>    final_scan_insn does now. If the current CC status is the same,
>    delete the instruction which sets CC_REG.

(Needs to be augmented with:)
... but change the noticed-insn (into a parallel if not already one)
that also sets CC_REG, and check that the resulting parallel insn is
recognized before going ahead with removing the original cc0-setter.
(If you don't do this step, you'll get in trouble with missing sets
of CC_REG for flow info, and unrecognized-insn ICEs.) If the
resulting insn is unrecognized && ifdef ENABLE_CHECKING, emit a
warning, so the target maintainer notices the opportunity for
optimization (i.e. which was there with the "old" NOTICE_UPDATE_CC
handling).

> At this point, the generated code quality should be approximately the
> same as when the target used cc0.

Not without pattern explosion for the additional cc0-setter
combinations. Also, extra tweaks would be needed to make up for the
missing handling of e.g. cc_status.flags.
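
For illustration only, the augmented step 5 can be modelled outside
GCC. The toy Python sketch below is not GCC code: Insn, the "cc"
values, recognized_with_cc_set and the function name are all made up
here, and a plain set of pattern names stands in for the real insn
recognizer. It drops a CC_REG setter only when the insn that produced
the tested value can be rewritten into a recognized variant that also
sets CC_REG; otherwise it keeps the setter and records a warning
(which the real pass would do only with ENABLE_CHECKING).

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Insn:
    name: str                    # hypothetical pattern name
    dest: Optional[str] = None   # register written, if any
    cc: str = "none"             # toy attribute: "none", "clobber" or "set_dest"
    tests: Optional[str] = None  # set only for an explicit CC_REG setter
    sets_cc: bool = False        # True once rewritten to also set CC_REG


def remove_redundant_cc_setters(insns: List[Insn],
                                recognized_with_cc_set: Set[str],
                                warnings: List[str]) -> List[Insn]:
    out: List[Insn] = []
    # (index into out, register): the insn whose result CC_REG reflects.
    tracked: Optional[Tuple[int, Optional[str]]] = None
    for insn in insns:
        if insn.tests is not None:
            # Explicit CC_REG setter: compare with the tracked status.
            if tracked is not None and tracked[1] == insn.tests:
                idx = tracked[0]
                producer = out[idx]
                if producer.tests is not None or producer.sets_cc:
                    continue  # CC_REG is already set explicitly; drop the repeat
                if producer.name in recognized_with_cc_set:
                    # Stand-in for "make it a parallel that also sets CC_REG
                    # and check that the result is recognized".
                    out[idx] = replace(producer, sets_cc=True)
                    continue
                warnings.append(f"{producer.name}: no variant that also sets CC_REG")
            out.append(insn)
            tracked = (len(out) - 1, insn.tests)
            continue
        out.append(insn)
        if insn.cc == "set_dest":
            tracked = (len(out) - 1, insn.dest)
        elif insn.cc == "clobber":
            tracked = None
        elif tracked is not None and insn.dest == tracked[1]:
            tracked = None  # tracked register overwritten; status is stale


    return out


if __name__ == "__main__":
    notes: List[str] = []
    seq = [Insn("add", dest="r1", cc="set_dest"), Insn("tst", tests="r1"),
           Insn("branch")]
    for insn in remove_redundant_cc_setters(seq, {"add"}, notes):
        print(insn)
    print(notes)
```

With "add" in the recognized set, the "tst" disappears and "add" is
marked as setting CC_REG; with an empty set the "tst" stays and a
warning is recorded, which is the missed-optimization case.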

I suggest that (it's ok to admit that) the approach you suggest is
likely to get decreased performance (without massive target
tweaking), but is offered as an easy way out for target maintainers,
and for targets lacking an interested maintainer.

> I want to stress that that this approach is intended to permit
> reasonably simple elimination of cc0 for all targets. It does not
> preclude any particular target from using a different approach.

Yes, this is good, a sign of a sane approach. :-)

> I don't think that a more accurate representation will help very much
> without a lot more work, because the optimizers won't really be able
> to use the better representation until after reload, and we don't do
> very much optimization after reload. Specifically, I don't think a
> better representation will improve very much on the proposed
> NOTICE_UPDATE_CC optimization pass. But nothing in what I am
> suggesting precludes following this path.

Except for the clobbercc *name*. :-)

But I think this suggestion and similar ones along this line are
doing too much for too little gain. To wit, if you want matching
performance, you still have pattern explosion in the .md as noted
above. Even if performance is unimportant, you *still* need to tweak
the .md and also the .h register macros. A basic likely-pessimizing
conversion is about the same amount of work as that; delete sCC
patterns and other not-strictly-necessary cc0 users, change remaining
users (only branches IIRC) to cbranch without post-reload splitters.
(My preferred alternative, already known but frequently reinvented.
:-)

So, I don't think the added infrastructure pays off, initially or
maintenance-wise. But I guess that argument holds less if the added
machinery is for free (i.e. Someone Else writes it. :-)

I haven't come up with anything substantial for the cc0 issue in
general myself but FWIW I do have a "migration path" ready for the
CRIS target: brute force followed by damage control for the pattern
explosion by "adding some more rtl-macro machinery". But I don't know
*what* added machinery. I played with some ideas but concluded that I
couldn't defend any proposal until I've actually done that
conversion.

The current mode- and code-macros (misnomer; actually, mode- and
code-*iterators*; they're an inside-out version of macros) don't help
with whole sequences, like *for example* one of:

 (clobber REG_CC)
 (set (reg:CC) (match_operand:0 ...))
 (set (reg:CC) (whatever SET_SRC, unless overlap with SET_DEST))
 (nothing)

I believe this would lead to some simple extension for
define_code_macro, but I'm not sure of exactly what, or if there's a
greater target-maintenance benefit with some other change. The only
thing I'm reasonably sure of, is that it'd help if the gen* machinery
would allow and translate a

 (parallel [(single rtl pattern) (nil)])

into

 (single rtl pattern)

And while we're here, a suggestion for the conversion process:

All cc0 targets must have simulators and newlib ports matching the
simplest simtest-howto.html description (drop-in equivalent for
newlib ok, i.e. expands in its own dir and with toplevel patches to
configury, with publicly accessible tarball for the lib) and
corresponding sim "baseboard" description. Else, they'll be obsoleted
and removed before any CC0 revolution (removal of HAVE_cc0 in the
middle-end). (Actually the lib+sim requirement should be applied to
all new targets.) Conversion for a target is considered successful if
no regressions with "make check". Listed maintainers would have
conversion rights.

brgds, H-P