On Tue, Oct 22, 2013 at 12:01 AM, Yufeng Zhang <yufeng.zh...@arm.com> wrote:
> Hi,
>
> This patch changes the widening_mul pass to fuse the widening multiply with
> accumulate only when the multiply has single use.  The widening_mul pass
> currently does the conversion regardless of the number of the uses, which
> can cause poor code-gen in cases like the following:
>
> typedef int ArrT [10][10];
>
> void
> foo (ArrT Arr, int Idx)
> {
>   Arr[Idx][Idx] = 1;
>   Arr[Idx + 10][Idx] = 2;
> }
>
> On AArch64, after widening_mul, the IR is like
>
>   _2 = (long unsigned int) Idx_1(D);
>   _3 = Idx_1(D) w* 40;                           <----
>   _5 = Arr_4(D) + _3;
>   *_5[Idx_1(D)] = 1;
>   _8 = WIDEN_MULT_PLUS_EXPR <Idx_1(D), 40, 400>; <----
>   _9 = Arr_4(D) + _8;
>   *_9[Idx_1(D)] = 2;
>
> Where the arrows point, there are redundant widening multiplies.
>
> Bootstrap successfully on x86_64.
>
> The patch passes the regtest on aarch64, arm and x86_64.
>
> OK for the trunk?
       if (!is_widening_mult_p (rhs1_stmt, &type1, &mult_rhs1,
-                               &type2, &mult_rhs2))
+                               &type2, &mult_rhs2)
+          || !has_single_use (rhs1))

please check has_single_use first, it's the cheaper check.

Ok with that change (and possibly a testcase).

Thanks,
Richard.

> Thanks,
> Yufeng
>
> p.s. Note that x86_64 doesn't suffer from this issue as the corresponding
> widening multiply accumulate op is not available on the target.
>
> gcc/
>
>         * tree-ssa-math-opts.c (convert_plusminus_to_widen): Call
>         has_single_use () and not do the conversion if has_single_use ()
>         returns false for the multiplication result.
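
The ordering requested in the review can be illustrated with a small standalone sketch. This is not GCC code: `value_sketch`, `single_use_sketch`, `widening_mult_sketch`, `should_fuse_sketch` and `matcher_calls` are all hypothetical names that only model the roles of has_single_use () and is_widening_mult_p (). The point it shows is that, because `||` short-circuits, putting the cheap use-count test first means the costlier pattern match never runs for a multiply result with more than one use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an SSA value; not GCC's real `tree`.  */
struct value_sketch {
  int num_uses;           /* how many statements read this value */
  bool is_widening_mult;  /* what the costlier pattern match would find */
};

/* Counts how often the costlier matcher runs, to make the ordering visible.  */
static int matcher_calls = 0;

/* Cheap test: models the role of has_single_use ().  */
static bool single_use_sketch(const struct value_sketch *v)
{
  assert(v != NULL);
  return v->num_uses == 1;
}

/* Costlier test: models the role of is_widening_mult_p ().  */
static bool widening_mult_sketch(const struct value_sketch *v)
{
  assert(v != NULL);
  matcher_calls++;
  return v->is_widening_mult;
}

/* Returns true only when the value is a single-use widening multiply.  */
static bool should_fuse_sketch(const struct value_sketch *v)
{
  /* `||` short-circuits: the matcher is skipped for multi-use values.  */
  if (!single_use_sketch(v) || !widening_mult_sketch(v))
    return false;
  return true;
}
```

Calling `should_fuse_sketch` on a value with `num_uses == 2` returns false and leaves `matcher_calls` unchanged, whereas a value with `num_uses == 1` reaches the matcher. The accepted or rejected set is the same with either operand order; only the amount of work done for the rejected multi-use case differs.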