on 2019/4/27 上午11:44, Bin.Cheng wrote: > On Fri, Apr 26, 2019 at 2:44 PM Kewen.Lin <li...@linux.ibm.com> wrote: >> + >> + /* zero cost use makes it easier to select memory based iv cand >> + for replacement of non memory based iv and its use. But if >> + the setup sequence are too costly, loop iv analysis can NOT >> + easily figure out it's finite, it's possible to stop the >> + low-overhead loop transformation and get unexpected code. */ >> + if (use->zero_cost_p && cand->iv->base_object && !use->iv->base_object >> + && elim_cost.cost >= 30) >> + dont_elim_p = true; > No, we'd like to avoid such things in general. The conditions look > like a hack to me. elim_cost is compared to express_cost, adding > another check on it at different place isn't really good, especially > 30 itself is a magic number. It's most likely improvement in some > cases, deterioration in others. >
Yes, I agree it's too hacky and unacceptable as a formal fix. And I tried to investigate more whether it's a general issue but never got exposed. > Also it punishes one pass (IVOPTs here) because of other pass' (RTL) > problem. It does't mean we can't do such transformations, only it has > to be as precise/conservative as possible. For example, if RTL loop > iv is improved to handle the case in the future, who would remember to > come back and adjust this? > Good question! > GCC lacks the capability passing information to later passes. Gimple > analyzer worked hard collecting various information but discards it > entering RTL or earlier. Other examples are like runtime alias > information, non-wrapping information for specific operations, etc. > IMHO, this is what needs to be done. As for this case, it could be > finite loop info, or non-wrapping info of the iv_var's increment > operation. By passing more information, RTL passes can be simplified > too. > Thanks for the information! Is there any under development work for this? That would be fine if we can pass down those information to downstream passes based on upcoming feature. > Thanks, > bin >> + >> /* The bound is a loop invariant, so it will be only computed >> once. */ >> elim_cost.cost = adjust_setup_cost (data, elim_cost.cost); >> @@ -5184,7 +5195,7 @@ determine_group_iv_cost_cond (struct ivopts_data *data, >> express_cost += bound_cost; >> >> /* Choose the better approach, preferring the eliminated IV. */ >> - if (elim_cost <= express_cost) >> + if (elim_cost <= express_cost && !dont_elim_p) >> { >> >> >> I was thinking whether this zero cost change just exposed >> an existing problem, then this kind of check should be for all >> cases not only for zero cost use, similar to >> expression_expensive_p? But why doesn't it happen before? >> Need more investigation. >> >>> >>>> Btw, this is for GCC10. >>> >>> *Phew* ;-) >>> >>> >>> Some trivial comments: >>> >>>> +static bool >>>> +invalid_insn_for_doloop_p (struct loop *loop) >>>> +{ >>>> + basic_block *body = get_loop_body (loop); >>>> + unsigned num_nodes = loop->num_nodes; >>>> + gimple_stmt_iterator gsi; >>>> + unsigned i; >>>> + >>>> + for (i = 0; i < num_nodes; i++) >>> >>> for (unsigned i = 0; i < num_nodes; i++) >>> >>> (and maybe you can just say loop->num_nodes here; I don't know if that >>> generates worse code, or if that even matters). >> >> Good idea, will fix it. >> >>> >>>> + if (dump_file && (dump_flags & TDF_DETAILS)) >>>> + fprintf ( >>>> + dump_file, >>>> + "predict doloop failure due to finding computed jump.\n"); >>> >>> We don't normally end lines in (. There are other solutions to why you >>> did that here; you can use string pasting, to break the string into two, >>> or factor out (some part of) the loop body to reduce indentation. >>> >> >> Will adjust it. >> >>> >>> Segher >>> >> >