Hello Richard:

Below are the performance numbers on CPU 2017 benchmarks with and without 
register pressure
changes for code sinking.

INT Benchmarks:

With register pressure code sinking changes:


                       Estimated                       Estimated
                 Base     Base        Base        Peak     Peak        Peak
Benchmarks       Copies  Run Time     Rate        Copies  Run Time     Rate 
--------------- -------  ---------  ---------    -------  ---------  ---------

500.perlbench_r       1        363       4.39  *
502.gcc_r             1        225       6.30  *
505.mcf_r             1        289       5.59  *
520.omnetpp_r         1        315       4.17  *
523.xalancbmk_r       1        238       4.43  *
525.x264_r            1        180       9.75  *
531.deepsjeng_r       1        291       3.94  *
541.leela_r           1        463       3.58  *
548.exchange2_r       1        222      11.8   *
557.xz_r              1        323       3.34  *
 Est. SPECrate(R)2017_int_base           5.24

Trunk without any register pressure code sinking changes:

500.perlbench_r       1        358       4.44  *
502.gcc_r             1        225       6.28  *
505.mcf_r             1        286       5.64  *
520.omnetpp_r         1        310       4.24  *
523.xalancbmk_r       1        235       4.50  *
525.x264_r            1        181       9.69  *
531.deepsjeng_r       1        291       3.94  *
541.leela_r           1        465       3.56  *
548.exchange2_r       1        219      11.9   *
557.xz_r              1        325       3.32  *
 Est. SPECrate(R)2017_int_base           5.26

FP benchmarks:

With register pressure code sinking changes:

503.bwaves_r          1        187      53.5   * 
507.cactuBSSN_r       1        235       5.39  *
508.namd_r            1        216       4.40  *
510.parest_r          1        340       7.69  *
511.povray_r          1        488       4.78  *
519.lbm_r             1        128       8.21  *
521.wrf_r             1        269       8.34  *
526.blender_r         1        311       4.89  *
527.cam4_r            1        314       5.56  *
538.imagick_r         1        228      10.9   *
544.nab_r             1        293       5.74  *
549.fotonik3d_r       1        237      16.4   *
554.roms_r            1        269       5.90  *
 Est. SPECrate(R)2017_fp_base            7.97

Trunk Without register pressure changes for code sinking:

503.bwaves_r          1        188      53.4   *
507.cactuBSSN_r       1        242       5.24  *
508.namd_r            1        215       4.42  *
510.parest_r          1        333       7.86  *
511.povray_r          1        481       4.85  *
519.lbm_r             1        128       8.22  *
521.wrf_r             1        269       8.34  *
526.blender_r         1        309       4.93  *
527.cam4_r            1        313       5.58  *
538.imagick_r         1        227      11.0   *
544.nab_r             1        291       5.79  *
549.fotonik3d_r       1        235      16.6   *
554.roms_r            1        268       5.92  *
 Est. SPECrate(R)2017_fp_base            8.00


Thanks & Regards
Ajit

On 03/11/23 8:24 pm, Ajit Agarwal wrote:
> Hello Richard:
> 
> 
> On 03/11/23 7:06 pm, Richard Biener wrote:
>> On Fri, Nov 3, 2023 at 11:20 AM Ajit Agarwal <aagar...@linux.ibm.com> wrote:
>>>
>>> Hello Richard:
>>>
>>> On 03/11/23 12:51 pm, Richard Biener wrote:
>>>> On Thu, Nov 2, 2023 at 9:50 PM Ajit Agarwal <aagar...@linux.ibm.com> wrote:
>>>>>
>>>>> Hello All:
>>>>>
>> [...]
>>>>>
>>>>> High register pressure region is the region where there are live-in of
>>>>> early blocks that has been modified by the early block. If there are
>>>>> modification of the variables in best block that are live-in in early
>>>>> block that are live-out of best block.
>>>>
>>>> ?!  Parse error.
>>>>
>>>
>>> I didnt understand what you meant here. Please suggest.
>>
>> I can't even guess what that paragraph means.  It fails at a
>> parsing level already, I can't even start to reason about what
>> the sentences mean.
> 
> Sorry for that I will modify.
> 
>>
>>>>> Bootstrapped and regtested on powerpc64-linux-gnu.
>>>>
>>>> What's the effect on code generation?
>>>>
>>>> Note that live is a quadratic problem while sinking was not.  You
>>>> are effectively making the pass unfit for -O1.
>>>>
>>>> You are computing "liveness" on GIMPLE where within EBBs there
>>>> isn't really any particular order of stmts, so it's kind of a garbage
>>>> heuristic.  Likewise you are not computing the effect that moving
>>>> a stmt has on liveness as far as I can see but you are just identifying
>>>> some odd metrics (I don't really understand them) to rank blocks,
>>>> not even taking the register file size into account.
>>>
>>>
>>> if the live out of best_bb  <= live out of early_bb, that shows
>>> that there are modification in best_bb.
>>
>> Hm?  Do you maybe want to say that if live_out (bb) < live_in (bb)
>> then some variables die during the execution of bb?
> 
> live_out (bb) < live_in(bb) means in bb there may be KILL (Variables)
> and there are more GEN (Variables).
> 
>   Otherwise,
>> if live_out (early) > live_out (best) then somewhere on the path
>> from early to best some variables die.
>>
> 
> If live_out (early) > live_out (best) means there are more GEN (Variables)
> between path from early to best.
> 
> 
>>> Then it's
>>> safer to move statements in best_bb as there are lesser interfering
>>> live variables in best_bb.
>>
>> consider a stmt
>>
>>  a = b + c;
>>
>> where b and c die at the definition of a.  Then moving the stmt
>> down from early_bb means you increase live_out (early_bb) by
>> one.  So why's that "safer" then?  Of course live_out (best_bb)
>> also increases by two then.
>>
> 
> If b and c die at the definition of a and generates a live_in(early_bb)
> would be live_out(early_bb) - 2 + 1.
> 
> the moving the stmt from early_bb down to best_bb increases live_out(early_bb)
> by one and live_out (best_bb) depends on the LIVEIN(for all successors of 
> best_bb)
> which may be same even if we move down.
> 
> There are chances that live_out (best_bb) greater if for all successors of 
> best_bb there are more GEN ( variables). If live_out (best_bb) is less
> means there more KILL (Variables) in successors of best_bb.
> 
> With my heuristics live_out (best_bb ) > live_out (early_bb) then we dont do
> code motion as there are chances of more interfering live ranges. If 
> liveout(best_bb)
> <= liveout (early_bb) then we do code motion as there is there are more 
> KILL(for
> all successors of best_bb) and there is less chance of interfering live 
> ranges.
> 
> With moving down above stmt from early_bb to best_bb increases 
> live_out(early_bb)
> by one but live_out(best_bb) may be remains. If live_out (early_bb) increase 
> by 1
> but if it becomes > live_out(best_bb) then we dont do code motion if we have 
> more GEN (Variables) in best_bb otherewise its safer to do 
> code motion.
> 
> for above statement a = b + c dies b and c and generates a in early_bb then
> liveout(early_bb) increases by 1. If before moving if liveout (best_bb) is 10
> and then liveout (early_bb) becomes > 10 then we dont do code motion otherwise
> we do code motion.
> 
> 
> 
> 
> 
>>> if there are lesser live out in best_bb, there is lesser chance
>>> of interfering live ranges and hence moving statements in best_bb
>>> will not increase register pressure.
>>>
>>> If the liveout of best_bb is greater than live-out of early_bb,
>>> moving statements in best_bb will increase chances of more interfering
>>> live ranges and hence increase in register pressure.
>>>
>>> This is how the heuristics is defined.
>>
>> I don't think they will work.  Do you have any numbers?
>>
> 
> My heuristics will work as mentioned above. I will  run the spec benchmarks
> and will able to give performance numbers.
> 
> Thanks & Regards
> Ajit
>  
>>>
>>>>
>>>> You are replacing the hot/cold heuristic.
>>>
>>>>
>>>> IMHO the sinking pass is the totally wrong place to do anything
>>>> about register pressure.  You are trying to solve a scheduling
>>>> problem by just looking at a single stmt.
>>>>
>>>
>>> bb->count from profile.cc are prone to errors as you have
>>> mentioned in previous mails. Main bottlenecks with code
>>> motion is increase in register pressure as that counts to
>>> spills in later phases of the compiler backend.
>>>
>>> Calculation of best_bb based of immediate dominator should
>>> consider register pressure instead of hold cold regions as that
>>> would effect code generation.
>>>
>>> If there is increase in register pressure with code motion and if
>>> we are moving into colder regions, that wont improve code generations.
>>>
>>> Hold/cold should be criteria but not the improved criteria with
>>> code motion.
>>>
>>> We should consider register pressure with code motion than hot/cold
>>> regions.
>>>
>>> Thanks & Regards
>>> Ajit
>>>
>>>> Richard.
>>>>
>>>>> Thanks & Regards
>>>
>>>
>>>>> Ajit
>>>>>
>>>>> tree-optimization: Add register pressure heuristics
>>>>>
>>>>> Currently code sinking heuristics are based on profile data like
>>>>> basic block count and sink frequency threshold. We have removed
>>>>> such heuristics to add register pressure heuristics based on
>>>>> live-in and live-out of early blocks and immediate dominator of
>>>>> use blocks.
>>>>>
>>>>> High register pressure region is the region where there are live-in of
>>>>> early blocks that has been modified by the early block. If there are
>>>>> modification of the variables in best block that are live-in in early
>>>>> block that are live-out of best block.
>>>>>
>>>>> 2023-11-03  Ajit Kumar Agarwal  <aagar...@linux.ibm.com>
>>>>>
>>>>> gcc/ChangeLog:
>>>>>
>>>>>         * tree-ssa-sink.cc (statement_sink_location): Add tree_live_info_p
>>>>>         as paramters.
>>>>>         (sink_code_in_bb): Ditto.
>>>>>         (select_best_block): Add register pressure heuristics to select
>>>>>         the best blocks in the immediate dominator for same loop nest 
>>>>> depth.
>>>>>         (execute): Add live range analysis.
>>>>>         (additional_var_map): New function.
>>>>>         * tree-ssa-live.cc (set_var_live_on_entry): Add virtual operand
>>>>>         tests on ssa_names.
>>>>>         (verify_live_on_entry): Ditto.
>>>>>
>>>>> gcc/testsuite/ChangeLog:
>>>>>
>>>>>         * gcc.dg/tree-ssa/ssa-sink-21.c: New test.
>>>>>         * gcc.dg/tree-ssa/ssa-sink-22.c: New test.
>>>>> ---
>>>>>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c | 15 ++++
>>>>>  gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c | 19 +++++
>>>>>  gcc/tree-ssa-live.cc                        | 11 ++-
>>>>>  gcc/tree-ssa-sink.cc                        | 93 ++++++++++++++-------
>>>>>  4 files changed, 104 insertions(+), 34 deletions(-)
>>>>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
>>>>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c
>>>>>
>>>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c 
>>>>> b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
>>>>> new file mode 100644
>>>>> index 00000000000..d3b79ca5803
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
>>>>> @@ -0,0 +1,15 @@
>>>>> +/* { dg-do compile } */
>>>>> +/* { dg-options "-O2 -fdump-tree-sink-stats" } */
>>>>> +void bar();
>>>>> +int j;
>>>>> +void foo(int a, int b, int c, int d, int e, int f)
>>>>> +{
>>>>> +  int l;
>>>>> +  l = a + b + c + d +e + f;
>>>>> +  if (a != 5)
>>>>> +    {
>>>>> +      bar();
>>>>> +      j = l;
>>>>> +    }
>>>>> +}
>>>>> +/* { dg-final { scan-tree-dump 
>>>>> {l_12\s+=\s+_4\s+\+\s+f_11\(D\);\n\s+bar\s+\(\)} sink1 } } */
>>>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c 
>>>>> b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c
>>>>> new file mode 100644
>>>>> index 00000000000..84e7938c54f
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-22.c
>>>>> @@ -0,0 +1,19 @@
>>>>> +/* { dg-do compile } */
>>>>> +/* { dg-options "-O2 -fdump-tree-sink-stats" } */
>>>>> +void bar();
>>>>> +int j, x;
>>>>> +void foo(int a, int b, int c, int d, int e, int f)
>>>>> +{
>>>>> +  int l;
>>>>> +  l = a + b + c + d +e + f;
>>>>> +  if (a != 5)
>>>>> +    {
>>>>> +      bar();
>>>>> +      if (b != 3)
>>>>> +        x = 3;
>>>>> +      else
>>>>> +        x = 5;
>>>>> +      j = l;
>>>>> +    }
>>>>> +}
>>>>> +/* { dg-final { scan-tree-dump 
>>>>> {l_13\s+=\s+_4\s+\+\s+f_12\(D\);\n\s+bar\s+\(\)} sink1 } } */
>>>>> diff --git a/gcc/tree-ssa-live.cc b/gcc/tree-ssa-live.cc
>>>>> index f06daf23035..998fe588278 100644
>>>>> --- a/gcc/tree-ssa-live.cc
>>>>> +++ b/gcc/tree-ssa-live.cc
>>>>> @@ -1141,7 +1141,8 @@ set_var_live_on_entry (tree ssa_name, 
>>>>> tree_live_info_p live)
>>>>>      def_bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
>>>>>
>>>>>    /* An undefined local variable does not need to be very alive.  */
>>>>> -  if (ssa_undefined_value_p (ssa_name, false))
>>>>> +  if (virtual_operand_p (ssa_name)
>>>>> +      || ssa_undefined_value_p (ssa_name, false))
>>>>>      return;
>>>>>
>>>>>    /* Visit each use of SSA_NAME and if it isn't in the same block as the 
>>>>> def,
>>>>> @@ -1540,7 +1541,6 @@ debug (tree_live_info_d *ptr)
>>>>>
>>>>>
>>>>>  /* Verify that the info in LIVE matches the current cfg.  */
>>>>> -
>>>>>  static void
>>>>>  verify_live_on_entry (tree_live_info_p live)
>>>>>  {
>>>>> @@ -1569,11 +1569,13 @@ verify_live_on_entry (tree_live_info_p live)
>>>>>           tree d = NULL_TREE;
>>>>>           bitmap loe;
>>>>>           var = partition_to_var (map, i);
>>>>> +         if (var == NULL_TREE)
>>>>> +           continue;
>>>>>           stmt = SSA_NAME_DEF_STMT (var);
>>>>>           tmp = gimple_bb (stmt);
>>>>> +
>>>>>           if (SSA_NAME_VAR (var))
>>>>>             d = ssa_default_def (cfun, SSA_NAME_VAR (var));
>>>>> -
>>>>>           loe = live_on_entry (live, e->dest);
>>>>>           if (loe && bitmap_bit_p (loe, i))
>>>>>             {
>>>>> @@ -1614,7 +1616,8 @@ verify_live_on_entry (tree_live_info_p live)
>>>>>               {
>>>>>                 /* An undefined local variable does not need to be very
>>>>>                    alive.  */
>>>>> -               if (ssa_undefined_value_p (var, false))
>>>>> +               if (virtual_operand_p (var)
>>>>> +                   || ssa_undefined_value_p (var, false))
>>>>>                   continue;
>>>>>
>>>>>                 /* The only way this var shouldn't be marked live on 
>>>>> entry is
>>>>> diff --git a/gcc/tree-ssa-sink.cc b/gcc/tree-ssa-sink.cc
>>>>> index a360c5cdd6e..d0c9ef1ab86 100644
>>>>> --- a/gcc/tree-ssa-sink.cc
>>>>> +++ b/gcc/tree-ssa-sink.cc
>>>>> @@ -176,6 +176,9 @@ nearest_common_dominator_of_uses (def_operand_p 
>>>>> def_p, bool *debug_stmts)
>>>>>     tree, return the best basic block between them (inclusive) to place
>>>>>     statements.
>>>>>
>>>>> +   The best basic block should be an immediate dominator of
>>>>> +   best basic block if we've moved to same loop nest.
>>>>> +
>>>>>     We want the most control dependent block in the shallowest loop nest.
>>>>>
>>>>>     If the resulting block is in a shallower loop nest, then use it.  Else
>>>>> @@ -191,24 +194,23 @@ nearest_common_dominator_of_uses (def_operand_p 
>>>>> def_p, bool *debug_stmts)
>>>>>  static basic_block
>>>>>  select_best_block (basic_block early_bb,
>>>>>                    basic_block late_bb,
>>>>> -                  gimple *stmt)
>>>>> +                  gimple *stmt,
>>>>> +                  tree_live_info_p &live)
>>>>>  {
>>>>>    basic_block best_bb = late_bb;
>>>>>    basic_block temp_bb = late_bb;
>>>>> -  int threshold;
>>>>>
>>>>>    while (temp_bb != early_bb)
>>>>>      {
>>>>>        /* If we've moved into a lower loop nest, then that becomes
>>>>>          our best block.  */
>>>>> -      if (bb_loop_depth (temp_bb) < bb_loop_depth (best_bb))
>>>>> +      if (bb_loop_depth (temp_bb) <= bb_loop_depth (best_bb))
>>>>>         best_bb = temp_bb;
>>>>>
>>>>>        /* Walk up the dominator tree, hopefully we'll find a shallower
>>>>>          loop nest.  */
>>>>>        temp_bb = get_immediate_dominator (CDI_DOMINATORS, temp_bb);
>>>>>      }
>>>>> -
>>>>>    /* Placing a statement before a setjmp-like function would be invalid
>>>>>       (it cannot be reevaluated when execution follows an abnormal edge).
>>>>>       If we selected a block with abnormal predecessors, just punt.  */
>>>>> @@ -233,24 +235,36 @@ select_best_block (basic_block early_bb,
>>>>>        && !dominated_by_p (CDI_DOMINATORS, best_bb->loop_father->latch, 
>>>>> best_bb))
>>>>>      return early_bb;
>>>>>
>>>>> -  /* Get the sinking threshold.  If the statement to be moved has memory
>>>>> -     operands, then increase the threshold by 7% as those are even more
>>>>> -     profitable to avoid, clamping at 100%.  */
>>>>> -  threshold = param_sink_frequency_threshold;
>>>>> -  if (gimple_vuse (stmt) || gimple_vdef (stmt))
>>>>> -    {
>>>>> -      threshold += 7;
>>>>> -      if (threshold > 100)
>>>>> -       threshold = 100;
>>>>> -    }
>>>>> +  int best_bb_liveout_cnt
>>>>> +    = bitmap_count_bits (&live->liveout[best_bb->index]);
>>>>> +  int early_bb_liveout_cnt
>>>>> +     = bitmap_count_bits (&live->liveout[early_bb->index]);
>>>>> +  int early_livein_cnt
>>>>> +     = bitmap_count_bits (&live->livein[early_bb->index]);
>>>>> +
>>>>> +  /* High register pressure region is the region where there are live-in 
>>>>> of
>>>>> +     early blocks that has been modified by the early block. If there are
>>>>> +     modification of the variables in best block that are live-in in 
>>>>> early
>>>>> +     block that are live-out of best block.  */
>>>>> +  bool live_in_rgn = (early_livein_cnt != 0
>>>>> +                     && early_bb_liveout_cnt <= early_livein_cnt);
>>>>> +
>>>>> +  bool high_reg_pressure_rgn
>>>>> +    = ((best_bb_liveout_cnt != 0 || live_in_rgn)
>>>>> +       && best_bb_liveout_cnt <= early_bb_liveout_cnt);
>>>>>
>>>>>    /* If BEST_BB is at the same nesting level, then require it to have
>>>>> -     significantly lower execution frequency to avoid gratuitous 
>>>>> movement.  */
>>>>> +     significantly high register pressure region to avoid gratuitous
>>>>> +      movement.  */
>>>>>    if (bb_loop_depth (best_bb) == bb_loop_depth (early_bb)
>>>>> -      /* If result of comparsion is unknown, prefer EARLY_BB.
>>>>> -        Thus use !(...>=..) rather than (...<...)  */
>>>>> -      && !(best_bb->count * 100 >= early_bb->count * threshold))
>>>>> -    return best_bb;
>>>>> +     && high_reg_pressure_rgn)
>>>>> +    {
>>>>> +     /* Avoid sinking to immediate dominator if the statement to be moved
>>>>> +       has memory operand and same loop nest.  */
>>>>> +      if (best_bb != late_bb && gimple_vuse (stmt))
>>>>> +       return late_bb;
>>>>> +      return best_bb;
>>>>> +    }
>>>>>
>>>>>    /* No better block found, so return EARLY_BB, which happens to be the
>>>>>       statement's original block.  */
>>>>> @@ -265,7 +279,8 @@ select_best_block (basic_block early_bb,
>>>>>  static bool
>>>>>  statement_sink_location (gimple *stmt, basic_block frombb,
>>>>>                          gimple_stmt_iterator *togsi, bool *zero_uses_p,
>>>>> -                        virtual_operand_live &vop_live)
>>>>> +                        virtual_operand_live &vop_live,
>>>>> +                        tree_live_info_p &live)
>>>>>  {
>>>>>    gimple *use;
>>>>>    use_operand_p one_use = NULL_USE_OPERAND_P;
>>>>> @@ -413,7 +428,7 @@ statement_sink_location (gimple *stmt, basic_block 
>>>>> frombb,
>>>>>        if (!dominated_by_p (CDI_DOMINATORS, commondom, frombb))
>>>>>         return false;
>>>>>
>>>>> -      commondom = select_best_block (frombb, commondom, stmt);
>>>>> +      commondom = select_best_block (frombb, commondom, stmt, live);
>>>>>
>>>>>        if (commondom == frombb)
>>>>>         return false;
>>>>> @@ -430,19 +445,17 @@ statement_sink_location (gimple *stmt, basic_block 
>>>>> frombb,
>>>>>             continue;
>>>>>           break;
>>>>>         }
>>>>> +
>>>>>        use = USE_STMT (one_use);
>>>>>
>>>>>        if (gimple_code (use) != GIMPLE_PHI)
>>>>>         {
>>>>> -         sinkbb = select_best_block (frombb, gimple_bb (use), stmt);
>>>>> +         sinkbb = select_best_block (frombb, gimple_bb (use), stmt, 
>>>>> live);
>>>>>
>>>>>           if (sinkbb == frombb)
>>>>>             return false;
>>>>>
>>>>> -         if (sinkbb == gimple_bb (use))
>>>>> -           *togsi = gsi_for_stmt (use);
>>>>> -         else
>>>>> -           *togsi = gsi_after_labels (sinkbb);
>>>>> +         *togsi = gsi_after_labels (sinkbb);
>>>>>
>>>>>           return true;
>>>>>         }
>>>>> @@ -454,7 +467,7 @@ statement_sink_location (gimple *stmt, basic_block 
>>>>> frombb,
>>>>>    if (!sinkbb)
>>>>>      return false;
>>>>>
>>>>> -  sinkbb = select_best_block (frombb, sinkbb, stmt);
>>>>> +  sinkbb = select_best_block (frombb, sinkbb, stmt, live);
>>>>>    if (!sinkbb || sinkbb == frombb)
>>>>>      return false;
>>>>>
>>>>> @@ -643,7 +656,8 @@ sink_common_stores_to_bb (basic_block bb)
>>>>>  /* Perform code sinking on BB */
>>>>>
>>>>>  static unsigned
>>>>> -sink_code_in_bb (basic_block bb, virtual_operand_live &vop_live)
>>>>> +sink_code_in_bb (basic_block bb, virtual_operand_live &vop_live,
>>>>> +               tree_live_info_p &live)
>>>>>  {
>>>>>    gimple_stmt_iterator gsi;
>>>>>    edge_iterator ei;
>>>>> @@ -670,7 +684,8 @@ sink_code_in_bb (basic_block bb, virtual_operand_live 
>>>>> &vop_live)
>>>>>        gimple_stmt_iterator togsi;
>>>>>        bool zero_uses_p;
>>>>>
>>>>> -      if (!statement_sink_location (stmt, bb, &togsi, &zero_uses_p, 
>>>>> vop_live))
>>>>> +      if (!statement_sink_location (stmt, bb, &togsi, &zero_uses_p,
>>>>> +                                   vop_live, live))
>>>>>         {
>>>>>           gimple_stmt_iterator saved = gsi;
>>>>>           if (!gsi_end_p (gsi))
>>>>> @@ -814,6 +829,15 @@ private:
>>>>>    bool unsplit_edges;
>>>>>  }; // class pass_sink_code
>>>>>
>>>>> +void additional_var_map (var_map &map)
>>>>> +{
>>>>> +  map->bmp_bbs = BITMAP_ALLOC (NULL);
>>>>> +  map->outofssa_p = false;
>>>>> +  basic_block bb;
>>>>> +  FOR_EACH_BB_FN (bb, cfun)
>>>>> +    bitmap_set_bit (map->bmp_bbs, bb->index);
>>>>> +}
>>>>> +
>>>>>  unsigned int
>>>>>  pass_sink_code::execute (function *fun)
>>>>>  {
>>>>> @@ -827,12 +851,21 @@ pass_sink_code::execute (function *fun)
>>>>>    calculate_dominance_info (CDI_DOMINATORS);
>>>>>
>>>>>    virtual_operand_live vop_live;
>>>>> +  var_map map = init_var_map (num_ssa_names);
>>>>> +  additional_var_map (map);
>>>>> +  tree_live_info_p live = calculate_live_ranges (map, true);
>>>>>
>>>>>    int *rpo = XNEWVEC (int, n_basic_blocks_for_fn (cfun));
>>>>>    int n = inverted_rev_post_order_compute (fun, rpo);
>>>>> +
>>>>>    for (int i = 0; i < n; ++i)
>>>>> -    todo |= sink_code_in_bb (BASIC_BLOCK_FOR_FN (fun, rpo[i]), vop_live);
>>>>> +    todo |= sink_code_in_bb (BASIC_BLOCK_FOR_FN (fun, rpo[i]), vop_live, 
>>>>> live);
>>>>>    free (rpo);
>>>>> +  if (live)
>>>>> +    delete_tree_live_info (live);
>>>>> +
>>>>> +  if (map)
>>>>> +    delete_var_map (map);
>>>>>
>>>>>    statistics_counter_event (fun, "Sunk statements", sink_stats.sunk);
>>>>>    statistics_counter_event (fun, "Commoned stores", sink_stats.commoned);
>>>>> --
>>>>> 2.39.3
>>>>>
>>>>>

Reply via email to