On Tue, Oct 22, 2013 at 12:01 AM, Yufeng Zhang <yufeng.zh...@arm.com> wrote:
> Hi,
>
> This patch changes the widening_mul pass to fuse the widening multiply with
> accumulate only when the multiply has a single use.  The widening_mul pass
> currently does the conversion regardless of the number of uses, which
> can cause poor code-gen in cases like the following:
>
> typedef int ArrT [10][10];
>
> void
> foo (ArrT Arr, int Idx)
> {
>   Arr[Idx][Idx] = 1;
>   Arr[Idx + 10][Idx] = 2;
> }
>
> On AArch64, after widening_mul, the IR is like
>
>   _2 = (long unsigned int) Idx_1(D);
>   _3 = Idx_1(D) w* 40;                           <----
>   _5 = Arr_4(D) + _3;
>   *_5[Idx_1(D)] = 1;
>   _8 = WIDEN_MULT_PLUS_EXPR <Idx_1(D), 40, 400>; <----
>   _9 = Arr_4(D) + _8;
>   *_9[Idx_1(D)] = 2;
>
> Where the arrows point, there are redundant widening multiplies.
>
> Bootstrapped successfully on x86_64.
>
> The patch passes the regtest on aarch64, arm and x86_64.
>
> OK for the trunk?

       if (!is_widening_mult_p (rhs1_stmt, &type1, &mult_rhs1,
-       &type2, &mult_rhs2))
+       &type2, &mult_rhs2)
+  || !has_single_use (rhs1))

Please check has_single_use first; it's the cheaper check.
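
The reordering relies on C's short-circuit ||: when the left operand is
true, the right operand is never evaluated.  A minimal sketch of that
effect follows.  The names cheap_single_use_p, costly_widening_p,
skip_conversion_p and costly_calls are hypothetical stand-ins for
illustration, not the GCC functions has_single_use and
is_widening_mult_p.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; these are NOT GCC functions.  */

/* Counts how often the costlier check actually runs.  */
static int costly_calls;

/* Cheap check: models has_single_use () as a plain comparison.  */
static bool
cheap_single_use_p (int num_uses)
{
  return num_uses == 1;
}

/* Costlier check: models is_widening_mult_p () and records each call.  */
static bool
costly_widening_p (bool is_widening)
{
  costly_calls++;
  return is_widening;
}

/* Return true when the conversion should be skipped.  The cheap test
   is the left operand of ||, so the costlier test is not evaluated
   when the multiply result has more than one use.  */
static bool
skip_conversion_p (int num_uses, bool is_widening)
{
  return !cheap_single_use_p (num_uses)
         || !costly_widening_p (is_widening);
}
```

With two uses, skip_conversion_p returns true without touching
costly_calls; with one use, the costlier check runs once.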

Ok with that change (and possibly a testcase).
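
As a possible starting point for such a testcase, here is a sketch
built on the foo example quoted in this message.  foo_shared is a
hypothetical hand-written equivalent, added only to show the sharing
the patch is after: the row offset is multiplied once, and the second
store is reached by a constant offset of 10 rows (400 bytes, assuming
a 4-byte int, which matches the constant in the WIDEN_MULT_PLUS_EXPR
above).  This checks behaviour only; it does not inspect the generated
IR, which a real testsuite entry would need to do.

```c
typedef int ArrT[10][10];

/* The function from the report quoted in this message.  */
void
foo (ArrT Arr, int Idx)
{
  Arr[Idx][Idx] = 1;
  Arr[Idx + 10][Idx] = 2;
}

/* Hypothetical hand-written equivalent: compute the row address once
   and reuse it for both stores.  */
void
foo_shared (ArrT Arr, int Idx)
{
  int (*row)[10] = Arr + Idx;   /* one multiply by sizeof (int[10]) */
  (*row)[Idx] = 1;              /* same element as Arr[Idx][Idx] */
  row[10][Idx] = 2;             /* same element as Arr[Idx + 10][Idx] */
}
```

Both functions need a buffer of at least Idx + 11 rows, since the
second store lands 10 rows past the first.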

Thanks,
Richard.



> Thanks,
> Yufeng
>
> p.s. Note that x86_64 doesn't suffer from this issue as the corresponding
> widening multiply accumulate op is not available on the target.
>
> gcc/
>
>         * tree-ssa-math-opts.c (convert_plusminus_to_widen): Call
>         has_single_use () and do not do the conversion if has_single_use ()
>         returns false for the multiplication result.
