On Mon, Mar 3, 2014 at 11:01 PM, Richard Sandiford <rdsandif...@googlemail.com> wrote: > AIUI: > > (1) The main point of http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01172.html > was that we had: > > emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx); > /* This might change the hard frame pointer in ways that aren't > apparent to early optimization passes, so force a clobber. */ > emit_clobber (hard_frame_pointer_rtx); > > and, after elimination, the clobber killed the assignment to the > hard frame pointer. Which it could. :-) > > (2) Aassuming for simplicity that STARTING_FRAME_OFFSET == 0, we end up > with an assignment like: > > (set frame_pointer_rtx hard_frame_pointer_rtx) > > And the problem is that frame_pointer_rtx often isn't a real register: > it gets eliminated to hard_frame_pointer_rtx+X, where X isn't known > until RA time. I.e. the assignment is really: > > (set (plus hard_frame_pointer_rtx X) hard_frame_pointer_rtx) > > which becomes: > > (set hard_frame_pointer_rtx (plus hard_frame_pointer_rtx -X)) > > Before RA it isn't obvious that the assignment clobbers > hard_frame_pointer_rtx, so it would seem to most passes that > frame_pointer_rtx and hard_frame_pointer_rtx are equivalent > after the set. That was why the clobber was there. > > (3) We chose to fix this by deleting the explicit clobber and relying on > the magicness of unspec_volatile to imply the clobber instead. > > But I don't think (3) is a good idea. We should instead fix the dataflow > so that it's accurate. > > Long term it would be good to have a different representation for (2),
An UNSPEC that sets frame_pointer_rtx, uses and clobbers hard_frame_pointer_rtx. In fact the plain move_insn is a lie and the clobber should have been at least in a parallel with the set of frame_pointer_rtx ... ? > but I don't have any bright ideas what that might be. Until then I think > we can model the dataflow accurately by having a use as well as a clobber > of hard_frame_pointer_rtx. I went back to r192682: Indeed. > http://gcc.gnu.org/ml/gcc-testresults/2012-10/msg02350.html > > and saw the -m32 tests fail without the builtins.c part of the patch. > They passed after it. I then went to r200643: > > 2013-07-03 Hans-Peter Nilsson <h...@bitrange.com> > > PR middle-end/55030 > * stmt.c (expand_nl_goto_receiver): Remove almost-copy of > expand_builtin_setjmp_receiver. > (expand_label): Adjust, call expand_builtin_setjmp_receiver > with NULL for the label parameter. > * builtins.c (expand_builtin_setjmp_receiver): Don't clobber > the frame-pointer. Adjust comments. > [HAVE_builtin_setjmp_receiver]: Emit builtin_setjmp_receiver > only if LABEL is non-NULL. > > and applied the full patch. The tests still passed, despite the removal > of the volatile checks. (To recap, the volatile checks touched here > were the same ones touched by: > > http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00868.html > > Rather than reverting that part I'm removing the check entirely, > since we seem to agree that the original asm handling wasn't necessary.) > > I'll run a full test overnight, but does this look like it might be > a way out, at least for 4.9? Shouldn't the use and clobber be part of the "move" and thus wrapped inside a parallel? Or am I misunderstanding how parallel works? What keeps those three insns adjacent otherwise? Thansk, Richard. > Thanks, > Richard > > > gcc/ > * builtins.c (expand_builtin_setjmp_receiver): Use and clobber > hard_frame_pointer_rtx. > * cse.c (cse_insn): Remove volatile check. > * cselib.c (cselib_process_insn): Likewise. > * dse.c (scan_insn): Likewise. > > Index: gcc/builtins.c > =================================================================== > --- gcc/builtins.c 2014-03-03 21:47:59.749026019 +0000 > +++ gcc/builtins.c 2014-03-03 21:48:00.550030853 +0000 > @@ -910,18 +910,27 @@ expand_builtin_setjmp_receiver (rtx rece > #ifdef HAVE_nonlocal_goto > if (! HAVE_nonlocal_goto) > #endif > - /* First adjust our frame pointer to its actual value. It was > - previously set to the start of the virtual area corresponding to > - the stacked variables when we branched here and now needs to be > - adjusted to the actual hardware fp value. > - > - Assignments to virtual registers are converted by > - instantiate_virtual_regs into the corresponding assignment > - to the underlying register (fp in this case) that makes > - the original assignment true. > - So the following insn will actually be decrementing fp by > - STARTING_FRAME_OFFSET. */ > - emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx); > + { > + /* First adjust our frame pointer to its actual value. It was > + previously set to the start of the virtual area corresponding to > + the stacked variables when we branched here and now needs to be > + adjusted to the actual hardware fp value. > + > + Assignments to virtual registers are converted by > + instantiate_virtual_regs into the corresponding assignment > + to the underlying register (fp in this case) that makes > + the original assignment true. > + So the following insn will actually be decrementing fp by > + STARTING_FRAME_OFFSET. */ > + emit_move_insn (virtual_stack_vars_rtx, hard_frame_pointer_rtx); > + > + /* Restoring the frame pointer also modifies the hard frame pointer. > + Mark it used (so that the previous assignment remains live once > + the frame pointer is eliminated) and clobbered (to represent the > + implicit update from the assignment). */ > + emit_use (hard_frame_pointer_rtx); > + emit_clobber (hard_frame_pointer_rtx); > + } > > #if !HARD_FRAME_POINTER_IS_ARG_POINTER > if (fixed_regs[ARG_POINTER_REGNUM]) > @@ -965,8 +974,7 @@ expand_builtin_setjmp_receiver (rtx rece > > /* We must not allow the code we just generated to be reordered by > scheduling. Specifically, the update of the frame pointer must > - happen immediately, not later. Similarly, we must block > - (frame-related) register values to be used across this code. */ > + happen immediately, not later. */ > emit_insn (gen_blockage ()); > } > > Index: gcc/cse.c > =================================================================== > --- gcc/cse.c 2014-03-03 21:47:59.869026741 +0000 > +++ gcc/cse.c 2014-03-03 21:48:00.625031305 +0000 > @@ -5664,11 +5664,6 @@ cse_insn (rtx insn) > invalidate (XEXP (dest, 0), GET_MODE (dest)); > } > > - /* A volatile ASM or an UNSPEC_VOLATILE invalidates everything. */ > - if (NONJUMP_INSN_P (insn) > - && volatile_insn_p (PATTERN (insn))) > - flush_hash_table (); > - > /* Don't cse over a call to setjmp; on some machines (eg VAX) > the regs restored by the longjmp come from a later time > than the setjmp. */ > Index: gcc/cselib.c > =================================================================== > --- gcc/cselib.c 2014-03-03 21:47:59.870026748 +0000 > +++ gcc/cselib.c 2014-03-03 21:48:00.626031311 +0000 > @@ -2629,9 +2629,7 @@ cselib_process_insn (rtx insn) > /* Forget everything at a CODE_LABEL, a volatile insn, or a setjmp. */ > if ((LABEL_P (insn) > || (CALL_P (insn) > - && find_reg_note (insn, REG_SETJMP, NULL)) > - || (NONJUMP_INSN_P (insn) > - && volatile_insn_p (PATTERN (insn)))) > + && find_reg_note (insn, REG_SETJMP, NULL))) > && !cselib_preserve_constants) > { > cselib_reset_table (next_uid); > Index: gcc/dse.c > =================================================================== > --- gcc/dse.c 2014-03-03 21:47:59.871026754 +0000 > +++ gcc/dse.c 2014-03-03 21:48:00.627031317 +0000 > @@ -2470,16 +2470,6 @@ scan_insn (bb_info_t bb_info, rtx insn) > return; > } > > - /* Cselib clears the table for this case, so we have to essentially > - do the same. */ > - if (NONJUMP_INSN_P (insn) > - && volatile_insn_p (PATTERN (insn))) > - { > - add_wild_read (bb_info); > - insn_info->cannot_delete = true; > - return; > - } > - > /* Look at all of the uses in the insn. */ > note_uses (&PATTERN (insn), check_mem_read_use, bb_info); >