On Fri, Aug 23, 2019 at 4:27 PM Kewen.Lin <li...@linux.ibm.com> wrote:
>
> Hi Bin
>
> on 2019/8/23 上午10:19, Bin.Cheng wrote:
> > On Thu, Aug 22, 2019 at 3:09 PM Kewen.Lin <li...@linux.ibm.com> wrote:
> >>
> >> Hi Bin,
> >>
> >> on 2019/8/22 下午1:46, Bin.Cheng wrote:
> >>> On Thu, Aug 22, 2019 at 11:18 AM Kewen.Lin <li...@linux.ibm.com> wrote:
> >>>>
> >>>> Hi Bin,
> >>>>
> >>>> Thanks for your time!
> >>>>
> >>>> on 2019/8/21 下午8:32, Bin.Cheng wrote:
> >>>>> On Wed, Aug 14, 2019 at 3:23 PM Kewen.Lin <li...@linux.ibm.com> wrote:
> >>>>>>
> >>>>>> Hi!
> >>>>>>
> >>>>>> Comparing to the previous versions of implementation mainly based on 
> >>>>>> the
> >>>>>> existing IV cands but zeroing the related group/use cost, this new one 
> >>>>>> is based
> >>>>>> on Richard and Segher's suggestion introducing one doloop dedicated IV 
> >>>>>> cand.
> >>>>>>
> >>>>>> Some key points are listed below:
> >>>>>>   1) New field doloop_p in struct iv_cand to indicate doloop dedicated 
> >>>>>> IV cand.
> >>>>>>   2) Special name "doloop" assigned.
> >>>>>>   3) Doloop IV cand with form (niter+1, +, -1)
> >>>>>>   4) For doloop IV cand, no extra one cost like BIV, assign zero cost 
> >>>>>> for step.
> >>>>>>   5) Support may_be_zero (regressed PR is in this case), the base of 
> >>>>>> doloop IV
> >>>>>>      can be COND_EXPR, add handlings in cand_value_at and 
> >>>>>> may_eliminate_iv.
> >>>>>>   6) Add more expr support in force_expr_to_var_cost for reasonable 
> >>>>>> cost
> >>>>>>      calculation on the IV base with may_be_zero (like COND_EXPR).
> >>>>>>   7) Set zero cost when using doloop IV cand for doloop use.
> >>>>>>   8) Add three hooks (should we merge _generic and _address?).
> >>>>>>     *) have_count_reg_decr_p, is to indicate the target has special 
> >>>>>> hardware
> >>>>>>        count register, we shouldn't consider the impact of doloop IV 
> >>>>>> when
> >>>>>>        calculating register pressures.
> >>>>>>     *) doloop_cost_for_generic, is the extra cost when using doloop IV 
> >>>>>> cand for
> >>>>>>        generic type IV use.
> >>>>>>     *) doloop_cost_for_address, is the extra cost when using doloop IV 
> >>>>>> cand for
> >>>>>>        address type IV use.
>
> >> The new patch addressing the comments is attached.
> >> Could you please have a look again?  Thanks in advance!
> > Thanks for working on this.  A bit more nit-pickings.
> >
> > -    add_candidate_1 (data, base, step, important,
> > -                    IP_NORMAL, use, NULL, orig_iv);
> > -  if (ip_end_pos (data->current_loop)
> > +    add_candidate_1 (data, base, step, important, IP_NORMAL, use, NULL, 
> > doloop,
> > +                    orig_iv);
> > +  if (!doloop && ip_end_pos (data->current_loop)
> > Could you add some comments elaborating why ip_end_pos candidate
> > shouldn't be added for doloop case?  Because the increment position is
> > wrong.
> >
> > Also if you make doloop the last default parameter of add_candidate_1,
> > you can save more unnecessary changes to calls to add_candidate?
> >
> > -    cost = get_computation_cost (data, use, cand, false,
> > -                                &inv_vars, NULL, &inv_expr);
> > +    {
> > +      cost = get_computation_cost (data, use, cand, false, &inv_vars, NULL,
> > +                                  &inv_expr);
> > +      if (cand->doloop_p)
> > +       cost += targetm.doloop_cost_for_generic;
> > +    }
> > This adjustment
> >
> >    cost = get_computation_cost (data, use, cand, true,
> >                                &inv_vars, &can_autoinc, &inv_expr);
> >
> > +  if (cand->doloop_p)
> > +    cost += targetm.doloop_cost_for_address;
> > +
> > and this adjustment can be moved into get_computation_cost where all
> > cost adjustments are done.
> >
> > +      /* For doloop candidate/use pair, adjust to zero cost.  */
> > +      if (group->doloop_p && cand->doloop_p)
> > +       cost = no_cost;
> > Note above code handles comparing against zero case and decreases the
> > cost by one (which prefers the same kind candidate as doloop one),
> > it's very possible to have -1 cost for doloop cand here.  how about
> > just set to no_cost if it's positive?  Your call.
> >
> > +/* For the targets which support doloop, to predict whether later RTL 
> > doloop
> > +   transformation will perform on this loop, further detect the doloop use 
> > and
> > +   mark the flag doloop_use_p if predicted.  */
> > +
> > +void
> > +predict_and_process_doloop (struct ivopts_data *data)
> > A better name here? Sorry I don't have another candidate in mind...
> >
> > +  data->doloop_use_p = false;
> > This can be moved to the beginning of above
> > 'predict_and_process_doloop' function.
> >
> > Lastly, could you please add some brief description/comment about
> > doloop handling as a subsection in the file head comment?
> >
> > Otherwise, the ivopt changes look good to me.
> >
> > Thanks,
> > bin
> >
>
> Thanks for your prompt reply!  I've updated the code as your comments,
> the updated version is attached.  Looking forward to your review again.

Sorry to bother.

-      return get_scaled_computation_cost_at (data, at, cost);
+      cost = get_scaled_computation_cost_at (data, at, cost);
+      /* For doloop IV cand, add on the extra cost.  */
+      cost += cand->doloop_p ? targetm.doloop_cost_for_address : 0;
+      return cost;
Here the cost is adjusted after scaling, while:

+  /* For doloop IV cand, add on the extra cost.  */
+  if (cand->doloop_p && use->type == USE_NONLINEAR_EXPR)
+    cost += targetm.doloop_cost_for_generic;
+
   return get_scaled_computation_cost_at (data, at, cost);
is adjusted before scaling.  Please work consistently.

+      /* Simply use 1.5 * add cost for now, FIXME if there is some
more accurate
+        cost evaluation way.  */
+      cost = comp_cost (1.5 * add_cost (speed, mode), 0);
+      break;
Is 1.5 important for some test cases?  Can we simply use 1 instead?
Or at least use xxx * 2 / 3 in order to avoid floating number.

Not sure if non-ivopts parts are already approved?  If so, the patch
is okay with above issues addressed.

Thanks very much for your time!

Thanks,
bin
>
>
> Thanks,
> Kewen
>
> -----
>
> gcc/ChangeLog
>
> 2019-08-23  Kewen Lin  <li...@gcc.gnu.org>
>
>         PR middle-end/80791
>         * config/rs6000/rs6000.c (TARGET_HAVE_COUNT_REG_DECR_P): New macro.
>         (TARGET_DOLOOP_COST_FOR_GENERIC): Likewise.
>         (TARGET_DOLOOP_COST_FOR_ADDRESS): Likewise.
>         * target.def (have_count_reg_decr_p): New hook.
>         (doloop_cost_for_generic): Likewise.
>         (doloop_cost_for_address): Likewise.
>         * doc/tm.texi.in (TARGET_HAVE_COUNT_REG_DECR_P): Likewise.
>         (TARGET_DOLOOP_COST_FOR_GENERIC): Likewise.
>         (TARGET_DOLOOP_COST_FOR_ADDRESS): Likewise.
>         * doc/tm.texi: Regenerate.
>         * tree-ssa-loop-ivopts.c (comp_cost::operator+=): Consider infinite 
> cost
>         addend.
>         (record_group): Init doloop_p.
>         (add_candidate_1): Add optional argument doloop, change the handlings
>         accordingly.
>         (add_candidate): Likewise.
>         (generic_predict_doloop_p): Update attribute.
>         (force_expr_to_var_cost): Add costing for expressions 
> COND_EXPR/LT_EXPR/
>         LE_EXPR/GT_EXPR/GE_EXPR/EQ_EXPR/NE_EXPR/UNORDERED_EXPR/ORDERED_EXPR/
>         UNLT_EXPR/UNLE_EXPR/UNGT_EXPR/UNGE_EXPR/UNEQ_EXPR/LTGT_EXPR/MAX_EXPR/
>         MIN_EXPR.
>         (get_computation_cost): Update for doloop IV cand extra cost.
>         (determine_group_iv_cost_cond): Update for doloop IV cand.
>         (determine_iv_cost): Likewise.
>         (ivopts_estimate_reg_pressure): Likewise.
>         (may_eliminate_iv): Update handlings for doloop IV cand.
>         (add_iv_candidate_for_doloop): New function.
>         (find_iv_candidates): Call function add_iv_candidate_for_doloop.
>         (iv_ca_set_no_cp): Update for doloop IV cand.
>         (iv_ca_set_cp): Likewise.
>         (iv_ca_dump): Dump register cost.
>         (find_doloop_use): New function.
>         (analyze_and_mark_doloop_use): Likewise.
>         (tree_ssa_iv_optimize_loop): Call function 
> analyze_and_mark_doloop_use.
>
> gcc/testsuite/ChangeLog
>
> 2019-08-23  Kewen Lin  <li...@gcc.gnu.org>
>
>         PR middle-end/80791
>         * gcc.dg/tree-ssa/ivopts-3.c: Adjust for doloop change.
>         * gcc.dg/tree-ssa/ivopts-lt.c: Likewise.
>         * gcc.dg/tree-ssa/pr32044.c: Likewise.
>
>

Reply via email to