On Wed, Apr 25, 2012 at 1:14 PM, Richard Guenther <richard.guent...@gmail.com> wrote: > On Wed, Apr 25, 2012 at 10:56 AM, Igor Zamyatin <izamya...@gmail.com> wrote: >> Hi all! >> >> I'd like to post for review the patch which makes some costs adjusting >> in get_computation_cost_at routine in ivopts part. >> As mentioned in the PR changes also fix the bwaves regression from PR 52272. >> Also changes introduce no degradations on spec2000/2006 and >> EEMBC1.1/2.0(this was measured on Atom) on x86 >> >> >> Bootstrapped and regtested on x86. Ok to commit? > > I can't make sense of the patch and the comment does not help. > > + diff_cost = cost.cost; > cost.cost /= avg_loop_niter (data->current_loop); > + add_cost_val = add_cost (TYPE_MODE (ctype), data->speed); > + /* Do cost correction if address cost is small enough > + and difference cost is high enough. */ > + if (address_p && diff_cost > add_cost_val > + && get_address_cost (symbol_present, var_present, > + offset, ratio, cstepi, > + TYPE_MODE (TREE_TYPE (utype)), > + TYPE_ADDR_SPACE (TREE_TYPE (utype)), > + speed, stmt_is_after_inc, > + can_autoinc).cost <= add_cost_val) > + cost.cost += add_cost_val; > > Please explain more thoroughly. It also would seem to be better to add > an extra case, as later code does
For example for such code for (j=0; j<M;j++) { for (i=0; i<N; i++) sum += ptr->a[j][i] * ptr->c[k][i]; } we currently have following gimple on x86 target (I provided a piece of all phase output): # ivtmp.13_30 = PHI <ivtmp.13_31(3), ivtmp.13_33(7)> D.1748_34 = (void *) ivtmp.13_30; D.1722_7 = MEM[base: D.1748_34, offset: 0B]; D.1750_36 = ivtmp.27_28; D.1751_37 = D.1750_36 + ivtmp.13_30; <-- we got non-invariant add which is not taken into account currently in cost model D.1752_38 = (void *) D.1751_37; D.1753_39 = (sizetype) k_8(D); D.1754_40 = D.1753_39 * 800; D.1723_9 = MEM[base: D.1752_38, index: D.1754_40, offset: 16000B]; ... With proposed fix we produce: # ivtmp.14_30 = PHI <ivtmp.14_31(3), 0(7)> D.1749_34 = (struct S *) ivtmp.25_28; D.1722_7 = MEM[base: D.1749_34, index: ivtmp.14_30, offset: 0B]; D.1750_35 = (sizetype) k_8(D); D.1751_36 = D.1750_35 * 800; D.1752_37 = ptr_6(D) + D.1751_36; D.1723_9 = MEM[base: D.1752_37, index: ivtmp.14_30, offset: 16000B]; which is more effective on platforms where address cost is cheaper than cost of addition operation. That's basically what this adjustment is for. So comment in the source code now looks as follows /* Do cost correction when address difference produces additional non-invariant add operation which is less profitable if address cost is cheaper than cost of add. */ > > /* Now the computation is in shape symbol + var1 + const + ratio * var2. > (symbol/var1/const parts may be omitted). If we are looking for an > address, find the cost of addressing this. */ > if (address_p) > return add_costs (cost, > get_address_cost (symbol_present, var_present, > offset, ratio, cstepi, > TYPE_MODE (TREE_TYPE (utype)), > TYPE_ADDR_SPACE (TREE_TYPE (utype)), > speed, stmt_is_after_inc, > can_autoinc)); > > thus refactoring the code a bit would make it possible to CSE the > get_address_cost > call and eventually make it clearer what the code does. 'offset' could be changed beetween two calls of get_address_cost so such refactoring looks useless. New patch (only the comment was changed) attached. Changelog was changed as well. > > Richard. > Changelog: 2012-04-26 Yuri Rumyantsev <yuri.rumyant...@intel.com> * tree-ssa-loop-ivopts.c (get_computation_cost_at): Adjust cost model when address difference produces additional non-invariant add operation which is less profitable if address cost is cheaper than cost of add. Thanks, Igor
ivopts1.patch
Description: Binary data