Hi Steven, Jeff,
I found a flaw in original patch, which results in inaccurate register
pressure.
Here comes the updated patches, improving code size a little bit on
Thumb2/powerpc comparing to original patches.
Please review.
Thanks
2012-10-16 Bin Cheng <[email protected]>
* gcse.c: Update copyright dates.
(hoist_expr_reaches_here_p): Change parameter type from char *
to sbitmap.
2012-10-16 Bin Cheng <[email protected]>
* common.opt (flag_ira_hoist_pressure): New.
* doc/invoke.texi (-fira-hoist-pressure): Describe.
* ira-costs.c (ira_set_pseudo_classes): New parameter.
* ira.h: Update copyright dates.
(ira_set_pseudo_classes): Update prototype.
* haifa-sched.c (sched_init): Update call.
* ira.c (ira): Update call.
* regmove.c: Update copyright dates.
(regmove_optimize): Update call.
* loop-invariant.c: Update copyright dates.
(move_loop_invariants): Update call.
* gcse.c: (struct bb_data): New structure.
(BB_DATA): New macro.
(curr_bb, curr_reg_pressure): New static variables.
(should_hoist_expr_to_dom): Rename from hoist_expr_reaches_here_p.
Change parameter expr_index to expr.
New parameters pressure_class, nregs and hoisted_bbs.
Use reg pressure to determine the distance expr can be hoisted.
(hoist_code): Use reg pressure to direct the hoist process.
(get_regno_pressure_class, get_pressure_class_and_nregs)
(change_pressure, calculate_bb_reg_pressure): New.
(one_code_hoisting_pass): Calculate register pressure. Allocate
and free data.
gcc/testsuite/ChangeLog
2012-10-16 Bin Cheng <[email protected]>
* testsuite/gcc.dg/hoist-register-pressure.c: New test.
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
On
> Behalf Of Bin Cheng
> Sent: Friday, October 12, 2012 4:09 PM
> To: 'Steven Bosscher'
> Cc: Jeff Law; [email protected]
> Subject: RE: [PATCH RFA] Implement register pressure directed hoist pass
>
> Hi,
> This is the updated patches split from original one according to Steven's
> suggestion. Also fixed spelling errors.
>
> Apart from this, I also implemented a draft patch simulating register
pressure
> accurately during hoisting, unfortunately the size data isn't better than
this
> patch. If it's right, this can prove my previous observation that decrease
of
> register pressure during hoisting process is rare. I will continue
> investigating the correctness of the patch and see what I can get.
>
> Please review. Thanks
>
> And the ChangeLog:
>
> 2012-10-12 Bin Cheng <[email protected]>
>
> * gcse.c: Update copyright dates.
> (hoist_expr_reaches_here_p): Change parameter type from char *
> to sbitmap.
>
> 2012-10-12 Bin Cheng <[email protected]>
>
> * common.opt (flag_ira_hoist_pressure): New.
> * doc/invoke.texi (-fira-hoist-pressure): Describe.
> * ira-costs.c (ira_set_pseudo_classes): New parameter.
> * ira.h: Update copyright dates.
> (ira_set_pseudo_classes): Update prototype.
> * haifa-sched.c (sched_init): Update call.
> * ira.c (ira): Update call.
> * regmove.c: Update copyright dates.
> (regmove_optimize): Update call.
> * loop-invariant.c: Update copyright dates.
> (move_loop_invariants): Update call.
> * gcse.c: (struct bb_data): New structure.
> (BB_DATA): New macro.
> (curr_bb, curr_reg_pressure): New static variables.
> (should_hoist_expr_to_dom): Rename from hoist_expr_reaches_here_p.
> Change parameter expr_index to expr.
> New parameters pressure_class, nregs and hoisted_bbs.
> Use reg pressure to determine the distance expr can be hoisted.
> (hoist_code): Use reg pressure to direct the hoist process.
> (get_regno_pressure_class, get_pressure_class_and_nregs)
> (change_pressure, calculate_bb_reg_pressure): New.
> (one_code_hoisting_pass): Calculate register pressure. Allocate
> and free data.
>
> gcc/testsuite/ChangeLog
> 2012-10-12 Bin Cheng <[email protected]>
>
> * testsuite/gcc.dg/hoist-register-pressure.c: New test.diff -x .svn -rpN wc1/gcc/common.opt wc2/gcc/common.opt
*** wc1/gcc/common.opt 2012-10-12 15:13:41.679617846 +0800
--- wc2/gcc/common.opt 2012-10-16 15:19:35.231674135 +0800
*************** Enum(ira_region) String(all) Value(IRA_R
*** 1392,1397 ****
--- 1392,1402 ----
EnumValue
Enum(ira_region) String(mixed) Value(IRA_REGION_MIXED)
+ fira-hoist-pressure
+ Common Report Var(flag_ira_hoist_pressure) Init(1) Optimization
+ Use IRA based register pressure calculation
+ in RTL hoist optimizations.
+
fira-loop-pressure
Common Report Var(flag_ira_loop_pressure)
Use IRA based register pressure calculation
diff -x .svn -rpN wc1/gcc/doc/invoke.texi wc2/gcc/doc/invoke.texi
*** wc1/gcc/doc/invoke.texi 2012-10-12 15:13:40.282479595 +0800
--- wc2/gcc/doc/invoke.texi 2012-10-16 15:19:35.239111464 +0800
*************** Objective-C and Objective-C++ Dialects}.
*** 372,378 ****
-finline-small-functions -fipa-cp -fipa-cp-clone @gol
-fipa-pta -fipa-profile -fipa-pure-const -fipa-reference @gol
-fira-algorithm=@var{algorithm} @gol
! -fira-region=@var{region} @gol
-fira-loop-pressure -fno-ira-share-save-slots @gol
-fno-ira-share-spill-slots -fira-verbose=@var{n} @gol
-fivopts -fkeep-inline-functions -fkeep-static-consts @gol
--- 372,378 ----
-finline-small-functions -fipa-cp -fipa-cp-clone @gol
-fipa-pta -fipa-profile -fipa-pure-const -fipa-reference @gol
-fira-algorithm=@var{algorithm} @gol
! -fira-region=@var{region} -fira-hoist-pressure @gol
-fira-loop-pressure -fno-ira-share-save-slots @gol
-fno-ira-share-spill-slots -fira-verbose=@var{n} @gol
-fivopts -fkeep-inline-functions -fkeep-static-consts @gol
*************** This typically results in the smallest c
*** 6996,7001 ****
--- 6996,7009 ----
@end table
+ @item -fira-hoist-pressure
+ @opindex fira-hoist-pressure
+ Use IRA to evaluate register pressure in the code hoisting pass for
+ decisions to hoist expressions. This option usually results in smaller
+ code, but it can slow the compiler down.
+
+ This option is enabled at level @option{-Os} for all targets.
+
@item -fira-loop-pressure
@opindex fira-loop-pressure
Use IRA to evaluate register pressure in loops for decisions to move
diff -x .svn -rpN wc1/gcc/gcse.c wc2/gcc/gcse.c
*** wc1/gcc/gcse.c 2012-10-12 15:45:12.822998377 +0800
--- wc2/gcc/gcse.c 2012-10-16 15:35:30.378109717 +0800
*************** along with GCC; see the file COPYING3.
*** 20,28 ****
/* TODO
- reordering of memory allocation and freeing to be more space efficient
! - do rough calc of how many regs are needed in each block, and a rough
! calc of how many regs are available in each class and use that to
! throttle back the code in cases where RTX_COST is minimal.
*/
/* References searched while implementing this.
--- 20,30 ----
/* TODO
- reordering of memory allocation and freeing to be more space efficient
! - simulate register pressure change of each basic block accurately during
! hoist process. But I doubt the benefit since most expressions hoisted
! are constant or address, which usually won't reduce register pressure.
! - calc rough register pressure information and use the info to drive all
! kinds of code motion (including code hoisting) in a unified way.
*/
/* References searched while implementing this.
*************** along with GCC; see the file COPYING3.
*** 141,151 ****
#include "diagnostic-core.h"
#include "toplev.h"
#include "rtl.h"
#include "tree.h"
#include "tm_p.h"
#include "regs.h"
! #include "hard-reg-set.h"
#include "flags.h"
#include "insn-config.h"
#include "recog.h"
--- 143,154 ----
#include "diagnostic-core.h"
#include "toplev.h"
+ #include "hard-reg-set.h"
#include "rtl.h"
#include "tree.h"
#include "tm_p.h"
#include "regs.h"
! #include "ira.h"
#include "flags.h"
#include "insn-config.h"
#include "recog.h"
*************** static bool doing_code_hoisting_p = fals
*** 412,417 ****
--- 415,436 ----
/* For available exprs */
static sbitmap *ae_kill;
+ /* Data stored for each basic block. */
+ struct bb_data
+ {
+ /* Maximal register pressure inside basic block for given register class
+ (defined only for the pressure classes). */
+ int max_reg_pressure[N_REG_CLASSES];
+ };
+
+ #define BB_DATA(bb) ((struct bb_data *) (bb)->aux)
+
+ static basic_block curr_bb;
+
+ /* Currently register pressure for each pressure class. */
+ static int curr_reg_pressure[N_REG_CLASSES];
+
+
static void compute_can_copy (void);
static void *gmalloc (size_t) ATTRIBUTE_MALLOC;
static void *gcalloc (size_t, size_t) ATTRIBUTE_MALLOC;
*************** static void alloc_code_hoist_mem (int, i
*** 460,468 ****
static void free_code_hoist_mem (void);
static void compute_code_hoist_vbeinout (void);
static void compute_code_hoist_data (void);
! static int hoist_expr_reaches_here_p (basic_block, int, basic_block, sbitmap,
! int, int *);
static int hoist_code (void);
static int one_code_hoisting_pass (void);
static rtx process_insert_insn (struct expr *);
static int pre_edge_insert (struct edge_list *, struct expr **);
--- 479,489 ----
static void free_code_hoist_mem (void);
static void compute_code_hoist_vbeinout (void);
static void compute_code_hoist_data (void);
! static int should_hoist_expr_to_dom (basic_block, struct expr *, basic_block,
! sbitmap, int, int *, enum reg_class,
! int *, bitmap);
static int hoist_code (void);
+ static enum reg_class get_pressure_class_and_nregs (rtx insn, int *nregs);
static int one_code_hoisting_pass (void);
static rtx process_insert_insn (struct expr *);
static int pre_edge_insert (struct edge_list *, struct expr **);
*************** prune_expressions (bool pre_p)
*** 1857,1863 ****
a basic block we should account for any side-effects of a subsequent
jump instructions that could clobber the expression. It would
be best to implement this check along the lines of
! hoist_expr_reaches_here_p where the target block is already known
and, hence, there's no need to conservatively prune expressions on
"intermediate" set-and-jump instructions. */
FOR_EACH_EDGE (e, ei, bb->preds)
--- 1878,1884 ----
a basic block we should account for any side-effects of a subsequent
jump instructions that could clobber the expression. It would
be best to implement this check along the lines of
! should_hoist_expr_to_dom where the target block is already known
and, hence, there's no need to conservatively prune expressions on
"intermediate" set-and-jump instructions. */
FOR_EACH_EDGE (e, ei, bb->preds)
*************** compute_code_hoist_data (void)
*** 2825,2834 ****
fprintf (dump_file, "\n");
}
! /* Determine if the expression identified by EXPR_INDEX would
! reach BB unimpared if it was placed at the end of EXPR_BB.
! Stop the search if the expression would need to be moved more
! than DISTANCE instructions.
It's unclear exactly what Muchnick meant by "unimpared". It seems
to me that the expression must either be computed or transparent in
--- 2846,2866 ----
fprintf (dump_file, "\n");
}
! /* Determine if the expression EXPR should be hoisted to EXPR_BB up in
! flow graph, if it can reach BB unimpared. Stop the search if the
! expression would need to be moved more than DISTANCE instructions.
!
! DISTANCE is the number of instructions through which EXPR can be
! hoisted up in flow graph.
!
! BB_SIZE points to an array which contains the number of instructions
! for each basic block.
!
! PRESSURE_CLASS and NREGS are register class and number of hard registers
! for storing EXPR.
!
! HOISTED_BBS points to a bitmap indicating basic blocks through which
! EXPR is hoisted.
It's unclear exactly what Muchnick meant by "unimpared". It seems
to me that the expression must either be computed or transparent in
*************** compute_code_hoist_data (void)
*** 2841,2858 ****
paths. */
static int
! hoist_expr_reaches_here_p (basic_block expr_bb, int expr_index, basic_block
bb,
! sbitmap visited, int distance, int *bb_size)
{
edge pred;
edge_iterator ei;
int visited_allocated_locally = 0;
/* Terminate the search if distance, for which EXPR is allowed to move,
is exhausted. */
if (distance > 0)
{
! distance -= bb_size[bb->index];
if (distance <= 0)
return 0;
--- 2873,2904 ----
paths. */
static int
! should_hoist_expr_to_dom (basic_block expr_bb, struct expr *expr,
! basic_block bb, sbitmap visited, int distance,
! int *bb_size, enum reg_class pressure_class,
! int *nregs, bitmap hoisted_bbs)
{
+ unsigned int i;
edge pred;
edge_iterator ei;
+ sbitmap_iterator sbi;
int visited_allocated_locally = 0;
/* Terminate the search if distance, for which EXPR is allowed to move,
is exhausted. */
if (distance > 0)
{
! /* Let EXPR be hoisted through basic block at no cost if the block
! has low register pressure. An exception is constant expression,
! because hoisting constant expr aggressively results in worse code.
! The exception is made by the observation of CSiBE on ARM target,
! while it has no obvious effect on other targets like x86, x86_64,
! mips and powerpc. */
! if (!flag_ira_hoist_pressure
! || (BB_DATA (bb)->max_reg_pressure[pressure_class]
! >= ira_class_hard_regs_num[pressure_class]
! || CONST_INT_P (expr->expr)))
! distance -= bb_size[bb->index];
if (distance <= 0)
return 0;
*************** hoist_expr_reaches_here_p (basic_block e
*** 2877,2897 ****
continue;
else if (TEST_BIT (visited, pred_bb->index))
continue;
!
! else if (! TEST_BIT (transp[pred_bb->index], expr_index))
break;
-
/* Not killed. */
else
{
SET_BIT (visited, pred_bb->index);
! if (! hoist_expr_reaches_here_p (expr_bb, expr_index, pred_bb,
! visited, distance, bb_size))
break;
}
}
if (visited_allocated_locally)
! sbitmap_free (visited);
return (pred == NULL);
}
--- 2923,2957 ----
continue;
else if (TEST_BIT (visited, pred_bb->index))
continue;
! else if (! TEST_BIT (transp[pred_bb->index], expr->bitmap_index))
break;
/* Not killed. */
else
{
SET_BIT (visited, pred_bb->index);
! if (! should_hoist_expr_to_dom (expr_bb, expr, pred_bb,
! visited, distance, bb_size,
! pressure_class, nregs, hoisted_bbs))
break;
}
}
if (visited_allocated_locally)
! {
! /* If EXPR can be hoisted to expr_bb, record basic blocks through
! which EXPR is hoisted in hoisted_bbs. Also update register
! pressure for basic blocks newly added in hoisted_bbs. */
! if (flag_ira_hoist_pressure && !pred)
! {
! EXECUTE_IF_SET_IN_SBITMAP (visited, 0, i, sbi)
! if (!bitmap_bit_p (hoisted_bbs, i))
! {
! bitmap_set_bit (hoisted_bbs, i);
! BB_DATA (BASIC_BLOCK (i))->max_reg_pressure[pressure_class]
! += *nregs;
! }
! }
! sbitmap_free (visited);
! }
return (pred == NULL);
}
*************** find_occr_in_bb (struct occr *occr, basi
*** 2908,2914 ****
return occr;
}
! /* Actually perform code hoisting. */
static int
hoist_code (void)
--- 2968,3011 ----
return occr;
}
! /* Actually perform code hoisting.
!
! The code hoisting pass can hoist multiple computations of the same
! expression along dominated path to a dominating basic block, like
! from b2/b3 to b1 as depicted below:
!
! b1 ------
! /\ |
! / \ |
! bx by distance
! / \ |
! / \ |
! b2 b3 ------
!
! Unfortunately code hoisting generally extends the live range of an
! output pseudo register, which increases register pressure and hurts
! register allocation. To address this issue, an attribute MAX_DISTANCE
! is computed and attached to each expression. The attribute is computed
! from rtx cost of the corresponding expression and it's used to control
! how long the expression can be hoisted up in flow graph. As the
! expression is hoisted up in flow graph, GCC decreases its DISTANCE
! and stops the hoist if DISTANCE reaches 0.
!
! Option "-fira-hoist-pressure" implements register pressure directed
! hoist based on upper method. The rationale is:
! 1. Calculate register pressure for each basic block by reusing IRA
! facility.
! 2. When expression is hoisted through one basic block, GCC checks
! register pressure of the basic block and decrease DISTANCE only
! when the register pressure is high. In other words, expression
! will be hoisted through basic block with low register pressure
! at no cost.
! 3. Update register pressure information for basic blocks through
! which expression is hoisted.
! TODO: It is possible to have register pressure decreased because
! of shrinked live ranges of input pseudo registers when hoisting
! an expression. For now, this effect is not simulated and we just
! increase register pressure for hoisted expressions. */
static int
hoist_code (void)
*************** hoist_code (void)
*** 2917,2928 ****
VEC (basic_block, heap) *dom_tree_walk;
unsigned int dom_tree_walk_index;
VEC (basic_block, heap) *domby;
! unsigned int i,j;
struct expr **index_map;
struct expr *expr;
int *to_bb_head;
int *bb_size;
int changed = 0;
/* Compute a mapping from expression number (`bitmap_index') to
hash table entry. */
--- 3014,3031 ----
VEC (basic_block, heap) *dom_tree_walk;
unsigned int dom_tree_walk_index;
VEC (basic_block, heap) *domby;
! unsigned int i, j, k;
struct expr **index_map;
struct expr *expr;
int *to_bb_head;
int *bb_size;
int changed = 0;
+ struct bb_data *data;
+ /* Basic blocks that have occurrences reachable from BB. */
+ bitmap from_bbs;
+ /* Basic blocks through which expr is hoisted. */
+ bitmap hoisted_bbs = NULL;
+ bitmap_iterator bi;
/* Compute a mapping from expression number (`bitmap_index') to
hash table entry. */
*************** hoist_code (void)
*** 2960,2965 ****
--- 3063,3072 ----
&& (EDGE_SUCC (ENTRY_BLOCK_PTR, 0)->dest
== ENTRY_BLOCK_PTR->next_bb));
+ from_bbs = BITMAP_ALLOC (NULL);
+ if (flag_ira_hoist_pressure)
+ hoisted_bbs = BITMAP_ALLOC (NULL);
+
dom_tree_walk = get_all_dominated_blocks (CDI_DOMINATORS,
ENTRY_BLOCK_PTR->next_bb);
*************** hoist_code (void)
*** 2978,2989 ****
{
if (TEST_BIT (hoist_vbeout[bb->index], i))
{
/* Current expression. */
struct expr *expr = index_map[i];
/* Number of occurrences of EXPR that can be hoisted to BB. */
int hoistable = 0;
- /* Basic blocks that have occurrences reachable from BB. */
- bitmap_head _from_bbs, *from_bbs = &_from_bbs;
/* Occurrences reachable from BB. */
VEC (occr_t, heap) *occrs_to_hoist = NULL;
/* We want to insert the expression into BB only once, so
--- 3085,3096 ----
{
if (TEST_BIT (hoist_vbeout[bb->index], i))
{
+ int nregs = 0;
+ enum reg_class pressure_class = NO_REGS;
/* Current expression. */
struct expr *expr = index_map[i];
/* Number of occurrences of EXPR that can be hoisted to BB. */
int hoistable = 0;
/* Occurrences reachable from BB. */
VEC (occr_t, heap) *occrs_to_hoist = NULL;
/* We want to insert the expression into BB only once, so
*************** hoist_code (void)
*** 2991,2998 ****
int insn_inserted_p;
occr_t occr;
- bitmap_initialize (from_bbs, 0);
-
/* If an expression is computed in BB and is available at end of
BB, hoist all occurrences dominated by BB to BB. */
if (TEST_BIT (comp[bb->index], i))
--- 3098,3103 ----
*************** hoist_code (void)
*** 3046,3058 ****
max_distance += (bb_size[dominated->index]
- to_bb_head[INSN_UID (occr->insn)]);
! /* Note if the expression would reach the dominated block
! unimpared if it was placed at the end of BB.
Keep track of how many times this expression is hoistable
from a dominated block into BB. */
! if (hoist_expr_reaches_here_p (bb, i, dominated, NULL,
! max_distance, bb_size))
{
hoistable++;
VEC_safe_push (occr_t, heap,
--- 3151,3168 ----
max_distance += (bb_size[dominated->index]
- to_bb_head[INSN_UID (occr->insn)]);
! pressure_class = get_pressure_class_and_nregs (occr->insn,
! &nregs);
!
! /* Note if the expression should be hoisted from the dominated
! block to BB if it can reach DOMINATED unimpared.
Keep track of how many times this expression is hoistable
from a dominated block into BB. */
! if (should_hoist_expr_to_dom (bb, expr, dominated, NULL,
! max_distance, bb_size,
! pressure_class, &nregs,
! hoisted_bbs))
{
hoistable++;
VEC_safe_push (occr_t, heap,
*************** hoist_code (void)
*** 3093,3098 ****
--- 3203,3230 ----
/* Punt, no point hoisting a single occurence. */
VEC_free (occr_t, heap, occrs_to_hoist);
+ if (flag_ira_hoist_pressure
+ && !VEC_empty (occr_t, occrs_to_hoist))
+ {
+ /* Update register pressure for basic block to which expr
+ is hoisted. */
+ data = BB_DATA (bb);
+ data->max_reg_pressure[pressure_class] += nregs;
+ }
+ else if (flag_ira_hoist_pressure)
+ {
+ /* Restore register pressure of basic block recorded in
+ hoisted_bbs when expr will not be hoisted. */
+ EXECUTE_IF_SET_IN_BITMAP (hoisted_bbs, 0, k, bi)
+ {
+ data = BB_DATA (BASIC_BLOCK (k));
+ data->max_reg_pressure[pressure_class] -= nregs;
+ }
+ }
+
+ if (flag_ira_hoist_pressure)
+ bitmap_clear (hoisted_bbs);
+
insn_inserted_p = 0;
/* Walk through occurrences of I'th expressions we want
*************** hoist_code (void)
*** 3141,3146 ****
--- 3273,3282 ----
}
VEC_free (basic_block, heap, dom_tree_walk);
+ BITMAP_FREE (from_bbs);
+ if (flag_ira_hoist_pressure)
+ BITMAP_FREE (hoisted_bbs);
+
free (bb_size);
free (to_bb_head);
free (index_map);
*************** hoist_code (void)
*** 3148,3153 ****
--- 3284,3448 ----
return changed;
}
+ /* Return pressure class and number of needed hard registers (through
+ *NREGS) of register REGNO. */
+ static enum reg_class
+ get_regno_pressure_class (int regno, int *nregs)
+ {
+ if (regno >= FIRST_PSEUDO_REGISTER)
+ {
+ enum reg_class pressure_class;
+
+ pressure_class = reg_allocno_class (regno);
+ pressure_class = ira_pressure_class_translate[pressure_class];
+ *nregs
+ = ira_reg_class_max_nregs[pressure_class][PSEUDO_REGNO_MODE (regno)];
+ return pressure_class;
+ }
+ else if (! TEST_HARD_REG_BIT (ira_no_alloc_regs, regno)
+ && ! TEST_HARD_REG_BIT (eliminable_regset, regno))
+ {
+ *nregs = 1;
+ return ira_pressure_class_translate[REGNO_REG_CLASS (regno)];
+ }
+ else
+ {
+ *nregs = 0;
+ return NO_REGS;
+ }
+ }
+
+ /* Return pressure class and number of hard registers (through *NREGS)
+ for destination of INSN. */
+ static enum reg_class
+ get_pressure_class_and_nregs (rtx insn, int *nregs)
+ {
+ rtx reg;
+ enum reg_class pressure_class;
+ rtx set = single_set (insn);
+
+ /* Considered invariant insns have only one set. */
+ gcc_assert (set != NULL_RTX);
+ reg = SET_DEST (set);
+ if (GET_CODE (reg) == SUBREG)
+ reg = SUBREG_REG (reg);
+ if (MEM_P (reg))
+ {
+ *nregs = 0;
+ pressure_class = NO_REGS;
+ }
+ else
+ {
+ gcc_assert (REG_P (reg));
+ pressure_class = reg_allocno_class (REGNO (reg));
+ pressure_class = ira_pressure_class_translate[pressure_class];
+ *nregs
+ = ira_reg_class_max_nregs[pressure_class][GET_MODE (SET_SRC (set))];
+ }
+ return pressure_class;
+ }
+
+ /* Increase (if INCR_P) or decrease current register pressure for
+ register REGNO. */
+ static void
+ change_pressure (int regno, bool incr_p)
+ {
+ int nregs;
+ enum reg_class pressure_class;
+
+ pressure_class = get_regno_pressure_class (regno, &nregs);
+ if (! incr_p)
+ curr_reg_pressure[pressure_class] -= nregs;
+ else
+ {
+ curr_reg_pressure[pressure_class] += nregs;
+ if (BB_DATA (curr_bb)->max_reg_pressure[pressure_class]
+ < curr_reg_pressure[pressure_class])
+ BB_DATA (curr_bb)->max_reg_pressure[pressure_class]
+ = curr_reg_pressure[pressure_class];
+ }
+ }
+
+ /* Calculate register pressure for each basic block by walking insns
+ from last to first. */
+ static void
+ calculate_bb_reg_pressure (void)
+ {
+ int i;
+ unsigned int j;
+ rtx insn;
+ basic_block bb;
+ bitmap curr_regs_live;
+ bitmap_iterator bi;
+
+
+ ira_setup_eliminable_regset ();
+ curr_regs_live = BITMAP_ALLOC (®_obstack);
+ FOR_EACH_BB (bb)
+ {
+ curr_bb = bb;
+ bitmap_copy (curr_regs_live, DF_LR_OUT (bb));
+ for (i = 0; i < ira_pressure_classes_num; i++)
+ curr_reg_pressure[ira_pressure_classes[i]] = 0;
+ EXECUTE_IF_SET_IN_BITMAP (curr_regs_live, 0, j, bi)
+ change_pressure (j, true);
+
+ FOR_BB_INSNS_REVERSE (bb, insn)
+ {
+ rtx dreg;
+ int regno;
+ df_ref *def_rec, *use_rec;
+
+ if (! NONDEBUG_INSN_P (insn))
+ continue;
+
+ for (def_rec = DF_INSN_DEFS (insn); *def_rec; def_rec++)
+ {
+ dreg = DF_REF_REAL_REG (*def_rec);
+ gcc_assert (REG_P (dreg));
+ regno = REGNO (dreg);
+ if (!(DF_REF_FLAGS (*def_rec)
+ & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)))
+ {
+ if (bitmap_clear_bit (curr_regs_live, regno))
+ change_pressure (regno, false);
+ }
+ }
+
+ for (use_rec = DF_INSN_USES (insn); *use_rec; use_rec++)
+ {
+ dreg = DF_REF_REAL_REG (*use_rec);
+ gcc_assert (REG_P (dreg));
+ regno = REGNO (dreg);
+ if (bitmap_set_bit (curr_regs_live, regno))
+ change_pressure (regno, true);
+ }
+ }
+ }
+ BITMAP_FREE (curr_regs_live);
+
+ if (dump_file == NULL)
+ return;
+
+ fprintf (dump_file, "\nRegister Pressure: \n");
+ FOR_EACH_BB (bb)
+ {
+ fprintf (dump_file, " Basic block %d: \n", bb->index);
+ for (i = 0; (int) i < ira_pressure_classes_num; i++)
+ {
+ enum reg_class pressure_class;
+
+ pressure_class = ira_pressure_classes[i];
+ if (BB_DATA (bb)->max_reg_pressure[pressure_class] == 0)
+ continue;
+
+ fprintf (dump_file, " %s=%d\n", reg_class_names[pressure_class],
+ BB_DATA (bb)->max_reg_pressure[pressure_class]);
+ }
+ }
+ fprintf (dump_file, "\n");
+ }
+
/* Top level routine to perform one code hoisting (aka unification) pass
Return nonzero if a change was made. */
*************** one_code_hoisting_pass (void)
*** 3167,3172 ****
--- 3462,3477 ----
doing_code_hoisting_p = true;
+ /* Calculate register pressure for each basic block. */
+ if (flag_ira_hoist_pressure)
+ {
+ regstat_init_n_sets_and_refs ();
+ ira_set_pseudo_classes (false, dump_file);
+ alloc_aux_for_blocks (sizeof (struct bb_data));
+ calculate_bb_reg_pressure ();
+ regstat_free_n_sets_and_refs ();
+ }
+
/* We need alias. */
init_alias_analysis ();
*************** one_code_hoisting_pass (void)
*** 3187,3192 ****
--- 3492,3502 ----
free_code_hoist_mem ();
}
+ if (flag_ira_hoist_pressure)
+ {
+ free_aux_for_blocks ();
+ free_reg_info ();
+ }
free_hash_table (&expr_hash_table);
free_gcse_mem ();
obstack_free (&gcse_obstack, NULL);
diff -x .svn -rpN wc1/gcc/haifa-sched.c wc2/gcc/haifa-sched.c
*** wc1/gcc/haifa-sched.c 2012-10-12 15:13:41.582914633 +0800
--- wc2/gcc/haifa-sched.c 2012-10-16 15:19:35.243118188 +0800
*************** sched_init (void)
*** 6629,6635 ****
/* We need info about pseudos for rtl dumps about pseudo
classes and costs. */
regstat_init_n_sets_and_refs ();
! ira_set_pseudo_classes (sched_verbose ? sched_dump : NULL);
sched_regno_pressure_class
= (enum reg_class *) xmalloc (max_regno * sizeof (enum reg_class));
for (i = 0; i < max_regno; i++)
--- 6629,6635 ----
/* We need info about pseudos for rtl dumps about pseudo
classes and costs. */
regstat_init_n_sets_and_refs ();
! ira_set_pseudo_classes (true, sched_verbose ? sched_dump : NULL);
sched_regno_pressure_class
= (enum reg_class *) xmalloc (max_regno * sizeof (enum reg_class));
for (i = 0; i < max_regno; i++)
diff -x .svn -rpN wc1/gcc/ira.c wc2/gcc/ira.c
*** wc1/gcc/ira.c 2012-10-12 15:13:41.687618514 +0800
--- wc2/gcc/ira.c 2012-10-16 15:19:35.246119338 +0800
*************** ira (FILE *f)
*** 4183,4189 ****
crtl->is_leaf = leaf_function_p ();
if (resize_reg_info () && flag_ira_loop_pressure)
! ira_set_pseudo_classes (ira_dump_file);
rebuild_p = update_equiv_regs ();
--- 4183,4189 ----
crtl->is_leaf = leaf_function_p ();
if (resize_reg_info () && flag_ira_loop_pressure)
! ira_set_pseudo_classes (true, ira_dump_file);
rebuild_p = update_equiv_regs ();
diff -x .svn -rpN wc1/gcc/ira-costs.c wc2/gcc/ira-costs.c
*** wc1/gcc/ira-costs.c 2012-10-12 15:13:41.694110750 +0800
--- wc2/gcc/ira-costs.c 2012-10-16 15:19:35.246119338 +0800
*************** ira_costs (void)
*** 2048,2056 ****
ira_free (total_allocno_costs);
}
! /* Entry function which defines classes for pseudos. */
void
! ira_set_pseudo_classes (FILE *dump_file)
{
allocno_p = false;
internal_flag_ira_verbose = flag_ira_verbose;
--- 2048,2057 ----
ira_free (total_allocno_costs);
}
! /* Entry function which defines classes for pseudos.
! Set pseudo_classes_defined_p only if DEFINE_PSEUDO_CLASSES is true. */
void
! ira_set_pseudo_classes (bool define_pseudo_classes, FILE *dump_file)
{
allocno_p = false;
internal_flag_ira_verbose = flag_ira_verbose;
*************** ira_set_pseudo_classes (FILE *dump_file)
*** 2059,2065 ****
initiate_regno_cost_classes ();
find_costs_and_classes (dump_file);
finish_regno_cost_classes ();
! pseudo_classes_defined_p = true;
finish_costs ();
}
--- 2060,2068 ----
initiate_regno_cost_classes ();
find_costs_and_classes (dump_file);
finish_regno_cost_classes ();
! if (define_pseudo_classes)
! pseudo_classes_defined_p = true;
!
finish_costs ();
}
diff -x .svn -rpN wc1/gcc/ira.h wc2/gcc/ira.h
*** wc1/gcc/ira.h 2012-10-12 15:13:41.691617905 +0800
--- wc2/gcc/ira.h 2012-10-16 15:19:35.246119338 +0800
***************
*** 1,6 ****
/* Communication between the Integrated Register Allocator (IRA) and
the rest of the compiler.
! Copyright (C) 2006, 2007, 2008, 2009, 2010
Free Software Foundation, Inc.
Contributed by Vladimir Makarov <[email protected]>.
--- 1,6 ----
/* Communication between the Integrated Register Allocator (IRA) and
the rest of the compiler.
! Copyright (C) 2006, 2007, 2008, 2009, 2010, 2012
Free Software Foundation, Inc.
Contributed by Vladimir Makarov <[email protected]>.
*************** extern void ira_init (void);
*** 131,137 ****
extern void ira_finish_once (void);
extern void ira_setup_eliminable_regset (void);
extern rtx ira_eliminate_regs (rtx, enum machine_mode);
! extern void ira_set_pseudo_classes (FILE *);
extern void ira_implicitly_set_insn_hard_regs (HARD_REG_SET *);
extern void ira_sort_regnos_for_alter_reg (int *, int, unsigned int *);
--- 131,137 ----
extern void ira_finish_once (void);
extern void ira_setup_eliminable_regset (void);
extern rtx ira_eliminate_regs (rtx, enum machine_mode);
! extern void ira_set_pseudo_classes (bool, FILE *);
extern void ira_implicitly_set_insn_hard_regs (HARD_REG_SET *);
extern void ira_sort_regnos_for_alter_reg (int *, int, unsigned int *);
diff -x .svn -rpN wc1/gcc/loop-invariant.c wc2/gcc/loop-invariant.c
*** wc1/gcc/loop-invariant.c 2012-10-12 15:13:41.674117845 +0800
--- wc2/gcc/loop-invariant.c 2012-10-16 15:19:35.246119338 +0800
***************
*** 1,5 ****
/* RTL-level loop invariant motion.
! Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010
Free Software Foundation, Inc.
This file is part of GCC.
--- 1,5 ----
/* RTL-level loop invariant motion.
! Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2012
Free Software Foundation, Inc.
This file is part of GCC.
*************** move_loop_invariants (void)
*** 1915,1921 ****
{
df_analyze ();
regstat_init_n_sets_and_refs ();
! ira_set_pseudo_classes (dump_file);
calculate_loop_reg_pressure ();
regstat_free_n_sets_and_refs ();
}
--- 1915,1921 ----
{
df_analyze ();
regstat_init_n_sets_and_refs ();
! ira_set_pseudo_classes (true, dump_file);
calculate_loop_reg_pressure ();
regstat_free_n_sets_and_refs ();
}
diff -x .svn -rpN wc1/gcc/regmove.c wc2/gcc/regmove.c
*** wc1/gcc/regmove.c 2012-10-12 15:13:41.659610135 +0800
--- wc2/gcc/regmove.c 2012-10-16 15:19:35.246119338 +0800
***************
*** 1,6 ****
/* Move registers around to reduce number of move instructions needed.
Copyright (C) 1987, 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
! 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
Free Software Foundation, Inc.
This file is part of GCC.
--- 1,7 ----
/* Move registers around to reduce number of move instructions needed.
Copyright (C) 1987, 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
! 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010,
! 2012
Free Software Foundation, Inc.
This file is part of GCC.
*************** regmove_optimize (void)
*** 1237,1243 ****
regstat_compute_ri ();
if (flag_ira_loop_pressure)
! ira_set_pseudo_classes (dump_file);
regno_src_regno = XNEWVEC (int, nregs);
for (i = nregs; --i >= 0; )
--- 1238,1244 ----
regstat_compute_ri ();
if (flag_ira_loop_pressure)
! ira_set_pseudo_classes (true, dump_file);
regno_src_regno = XNEWVEC (int, nregs);
for (i = nregs; --i >= 0; )
diff -x .svn -rpN wc1/gcc/testsuite/gcc.dg/hoist-register-pressure.c
wc2/gcc/testsuite/gcc.dg/hoist-register-pressure.c
*** wc1/gcc/testsuite/gcc.dg/hoist-register-pressure.c 1970-01-01
08:00:00.000000000 +0800
--- wc2/gcc/testsuite/gcc.dg/hoist-register-pressure.c 2012-10-16
15:19:35.246119338 +0800
***************
*** 0 ****
--- 1,31 ----
+ /* { dg-options "-Os -fdump-rtl-hoist" } */
+ /* { dg-final { scan-rtl-dump "PRE/HOIST: end of bb .* copying expression"
"hoist" } } */
+
+ #define BUF 100
+ int a[BUF];
+
+ void com (int);
+ void bar (int);
+
+ int foo (int x, int y, int z)
+ {
+ /* "x+y" won't be hoisted if "-fira-hoist-pressure" is disabled,
+ because its rtx_cost is too small. */
+ if (z)
+ {
+ a[1] = a[0] + a[2];
+ a[2] = a[1] + a[3];
+ a[3] = a[2] + a[4];
+ a[4] = a[3] + a[5];
+ a[5] = a[4] + a[6];
+ a[6] = a[5] + a[7];
+ a[7] = a[6] + a[8];
+ com (x+y);
+ }
+ else
+ {
+ bar (x+y);
+ }
+
+ return 0;
+ }
Index: gcc/gcse.c
===================================================================
--- gcc/gcse.c (revision 192194)
+++ gcc/gcse.c (working copy)
@@ -1,6 +1,6 @@
/* Partial redundancy elimination / Hoisting for RTL.
Copyright (C) 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
- 2006, 2007, 2008, 2009, 2010, 2011 Free Software Foundation, Inc.
+ 2006, 2007, 2008, 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
This file is part of GCC.
@@ -460,7 +460,7 @@ static void alloc_code_hoist_mem (int, int);
static void free_code_hoist_mem (void);
static void compute_code_hoist_vbeinout (void);
static void compute_code_hoist_data (void);
-static int hoist_expr_reaches_here_p (basic_block, int, basic_block, char *,
+static int hoist_expr_reaches_here_p (basic_block, int, basic_block, sbitmap,
int, int *);
static int hoist_code (void);
static int one_code_hoisting_pass (void);
@@ -2842,7 +2842,7 @@ compute_code_hoist_data (void)
static int
hoist_expr_reaches_here_p (basic_block expr_bb, int expr_index, basic_block bb,
- char *visited, int distance, int *bb_size)
+ sbitmap visited, int distance, int *bb_size)
{
edge pred;
edge_iterator ei;
@@ -2863,7 +2863,8 @@ hoist_expr_reaches_here_p (basic_block expr_bb, in
if (visited == NULL)
{
visited_allocated_locally = 1;
- visited = XCNEWVEC (char, last_basic_block);
+ visited = sbitmap_alloc (last_basic_block);
+ sbitmap_zero (visited);
}
FOR_EACH_EDGE (pred, ei, bb->preds)
@@ -2874,7 +2875,7 @@ hoist_expr_reaches_here_p (basic_block expr_bb, in
break;
else if (pred_bb == expr_bb)
continue;
- else if (visited[pred_bb->index])
+ else if (TEST_BIT (visited, pred_bb->index))
continue;
else if (! TEST_BIT (transp[pred_bb->index], expr_index))
@@ -2883,14 +2884,14 @@ hoist_expr_reaches_here_p (basic_block expr_bb, in
/* Not killed. */
else
{
- visited[pred_bb->index] = 1;
+ SET_BIT (visited, pred_bb->index);
if (! hoist_expr_reaches_here_p (expr_bb, expr_index, pred_bb,
visited, distance, bb_size))
break;
}
}
if (visited_allocated_locally)
- free (visited);
+ sbitmap_free (visited);
return (pred == NULL);
}