> Richard Sandiford <richard.sandif...@linaro.org> wrote on 30/08/2011 > 03:29:26 PM: > > > From: Richard Sandiford <richard.sandif...@linaro.org> > > To: gcc-patches@gcc.gnu.org > > Cc: Ayal Zaks/Haifa/IBM@IBMIL > > Date: 30/08/2011 03:29 PM > > Subject: [4/4] Make SMS schedule register moves > > > > This is the move-scheduling patch itself. It should be fairly > > self-explanatory. Let me know if it isn't, and I'll try to improve > > the commentary. > > > > > One potentially controversial change is to the way we handle moves > > in the prologue and epilogue. The current code uses a conservative
- conservative>>accurate? (Your approach seems to be more "conservative") > > > check to decide which stages need which moves. This check copes > > with values that are live before the loop, and emits some moves > > that aren't actually needed. The code also emits chains of moves > > that can be trivially optimised through propagation. We rely on > > later patches to clean this up for us (and they do). > > > > So, rather than keep a rather complicated check here, I've simply > > emitted the moves for all stages. In principle, that will generate > > a little more dead code than now in cases where chains of two moves > > are needed, but that seems to be very rare (when moves are scheduled). indeed, it's better to simplify code generation and rely on subsequent cleanups. Not sure about generating more (dead?) code in principle; could you elaborate, for the record? > > > > > Richard > > > > > > gcc/ > > * modulo-sched.c (extend_node_sched_params): New function. > > (print_node_sched_params): Print the stage. > > (set_columns_for_row, schedule_reg_move): New functions. > > (set_columns_for_ps): Move up file and use set_columns_for_now. typo: now>>row (unless you intend this to be temporary ;-) > > > (schedule_reg_moves): Call extend_node_sched_params and > > schedule_reg_move. Extend size of uses bitmap. Return false > > if a move could not be scheduled. > > (apply_reg_moves): Don't emit moves here. > > (permute_partial_schedule): Handle register moves. > > (duplicate_insns_of_cycles): Remove for_prolog. Always emit moves. Always emit all moves > > > (generate_prolog_epilog): Update calls accordingly. > > > > Index: gcc/modulo-sched.c As I'm going over the patch, let me make sure I understand what's going on: .. > +/* Try to schedule the move with ps_insn identifier I_REG_MOVE in PS. > + The source of the move is provided by I_MUST_FOLLOW, which has > + already been scheduled. The source y of an x=y move is defined by the previous iteration of the next move y=z (or by the original y=... move->def). We schedule moves one after the other bottom-up, starting from the original move->def moving backwards in cycles. This way, the instruction I_MUST_FOLLOW defining y is always already scheduled when we come to schedule the dependent I_REG_MOVE x=y move. .. > + /* The cyclic lifetime of move->new_reg starts and ends at move->def > + (the instruction that defines move->old_reg). So instruction I_REG_MOVE (new_reg=reg) must be scheduled before the next I_MUST_FOLLOW move/original-def (due to anti-dependence: it overwrites reg), but after the previous instance of I_MUST_FOLLOW (due to true dependence; i.e. account for latency also). Why do moves, except for the one closest to move->def (which is directly dependent upon it, i.e. for whom move->def == I_MUST_FOLLOW), have to worry about move->def at all? (Or have their cyclic lifetimes start and end there?) In general, there's a distinction between the "cycle" an instruction is scheduled at (for the first iteration), which is an absolute arbitrary integer, and the "row" it is placed in the PS, which is between 0 and ii-1, and is simply cycle mod ii. When scheduling, it's clearer to reason about cycles, especially if your window includes a row twice. In addition to the (true+anti) dependences involving I_REG_MOVE with I_MUST_FOLLOW, it has (true+anti) dependences with each use it feeds. We could alternatively set these dependencies of I_REG_MOVE on I_MUST_FOLLOW, and on its uses, and call get_sched_window(). But it's presumably simpler to handle it directly here. Right? Ayal.