https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53513
Oleg Endo <olegendo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #33691|0                           |1
        is obsolete|                            |

--- Comment #23 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Created attachment 33716
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33716&action=edit
Using virtual FPSCR registers to model insn dependencies

This is a somewhat larger patch which does a couple of things to address the
deficits of the previous approach.

1) The hard reg set is extended by two more registers:
   - FPSCR_MODES_REG represents any mode bits in FPSCR, which are used by
     fp insns.
   - FPSCR_STAT_REG represents the status bits in FPSCR, which are written
     by fp insns.
   All fp insns that previously had a
     (use (match_operand:PSI 2 "fpscr_operand" ""))
   now get a
     (use (reg:SI FPSCR_MODES_REG))

2) The 'fpu_switch' insn, which is just a FPSCR load/store, now uses and sets
   the virtual regs FPSCR_MODES_REG and FPSCR_STAT_REG. This creates a sort
   of reordering barrier due to the reg uses/sets.

3) Two new dummy insns extend_psi_si and truncate_si_psi are added to convert
   PSImode <-> SImode, because we can do logic only on SImode, but FPSCR is
   described as PSImode. This hack could be removed later maybe.

4) The insns toggle_sz and toggle_pr are now also setting the virtual reg
   FPSCR_MODES_REG.

5) The fsca insn needed some adjustments for the operand matching, as combine
   started going different paths. Some of the fsca tests in gcc.target/sh
   were failing and are fixed by this.

6) sh_emit_set_t_insn is adjusted to emit the correct comparison insns with
   use/set of virtual FPSCR regs.

7) calc_live_regs is adjusted to ignore virtual regs FPSCR_MODES_REG and
   FPSCR_STAT_REG, or else it will try to generate push/pop insns for those
   regs.

8) sh_emit_mode_set now emits the FPSCR store-modify-load insn sequence
   instead of loading FPSCR from __fpscr_values.

9) Some redundant fp insns and related functions are deleted.
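
To illustrate why the uses/sets in items 1, 2 and 4 act as a reordering
barrier, here is a minimal toy model in C. It is not code from the patch or
from GCC: the register ids, the toy_insn struct and toy_can_reorder are all
hypothetical names made up for this sketch, and the fp insns are modeled as
only reading the mode bits, as described in item 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical one-bit-per-register ids, for illustration only.
   These are NOT the hard register numbers used by the SH backend.  */
enum {
    R_FR0         = 1u << 0,
    R_FR1         = 1u << 1,
    R_FR2         = 1u << 2,
    R_FR3         = 1u << 3,
    R_FPSCR_MODES = 1u << 4,  /* stands in for FPSCR_MODES_REG */
    R_FPSCR_STAT  = 1u << 5   /* stands in for FPSCR_STAT_REG */
};

/* Toy insn: just the sets of registers it reads and writes.  */
struct toy_insn {
    const char *name;
    unsigned uses;
    unsigned sets;
};

/* fp insns read the mode bits (item 1).  */
const struct toy_insn toy_fadd =
    { "fadd", R_FR0 | R_FR1 | R_FPSCR_MODES, R_FR0 };
const struct toy_insn toy_fmul =
    { "fmul", R_FR2 | R_FR3 | R_FPSCR_MODES, R_FR2 };

/* fpu_switch uses and sets both virtual regs (item 2).  */
const struct toy_insn toy_fpu_switch =
    { "fpu_switch", R_FPSCR_MODES | R_FPSCR_STAT,
                    R_FPSCR_MODES | R_FPSCR_STAT };

/* toggle_pr also sets the mode bits (item 4).  */
const struct toy_insn toy_toggle_pr =
    { "toggle_pr", R_FPSCR_MODES, R_FPSCR_MODES };

/* Two adjacent insns may be swapped only if there is no
   read-after-write, write-after-read or write-after-write overlap.  */
bool toy_can_reorder(const struct toy_insn *a, const struct toy_insn *b)
{
    bool raw = (a->sets & b->uses) != 0;
    bool war = (a->uses & b->sets) != 0;
    bool waw = (a->sets & b->sets) != 0;
    return !(raw || war || waw);
}

/* Small self-check of the model.  */
void toy_self_check(void)
{
    /* Two fp insns only read the mode bits, so they stay reorderable.  */
    assert(toy_can_reorder(&toy_fadd, &toy_fmul));
    /* Anything that writes the mode bits pins the fp insns around it.  */
    assert(!toy_can_reorder(&toy_fadd, &toy_fpu_switch));
    assert(!toy_can_reorder(&toy_toggle_pr, &toy_fmul));
}
```

In this model two fp insns can still be freely scheduled against each other
because both only read the mode bits, while an FPSCR write in between cannot
be moved across either of them.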

I've tried to keep the patch as short as possible and not do too many
unrelated changes. I haven't fully tested the patch, so there is probably
some fallout.

There are two regressions with this patch. First, the FPSCR loads are not
put into delay slots anymore, probably because the 'fpu_switch' pattern now
has too many sets and the DBR pass rejects it as a slot candidate. Second,
there are failures in gcc.target/sh w.r.t. interrupt functions, which can be
addressed later.

Kaz, could you please have an early look at it?