On Thu, Jul 13, 2017 at 2:18 AM, Kugan Vivekanandarajah
<kugan.vivekanandara...@linaro.org> wrote:
> I am looking into reversing loops to increase efficiency. There is
> already a PR22041 for this and an old patch
> https://gcc.gnu.org/ml/gcc-patches/2006-01/msg01851.html by Zdenek
> which never made it to mainline.
>
> For a constant loop count, the ivcanon pass adds a reverse IV, but it
> is not selected by ivopt.

ivopt will never do loop reversal. If it selected this IV, it would need to
keep the original one as well, or compute the old i from the new one.
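
To make that concrete, here is a minimal sketch (hand-written C, not GCC
output; the function names are made up for illustration).  With only a
down-counting IV, the original index still has to be recomputed from it for
the address uses, and the iteration order stays the same:

```c
/* Original forward loop from the example. */
static void copy_fwd(double *a, double *c)
{
  for (int i = 0; i < 800; ++i)
    c[i] = a[i];
}

/* Same iteration order, driven only by a down-counting IV n.
   The old index i is recomputed from n on every iteration.  */
static void copy_countdown(double *a, double *c)
{
  for (unsigned int n = 800; n != 0; --n)
    {
      unsigned int i = 800 - n;  /* old i based on the new IV */
      c[i] = a[i];
    }
}
```

Both functions store the same values in the same order; the second just pays
for the extra subtraction, which is why selecting the reverse IV alone does
not reverse the loop.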

Loop reversal needs dependence analysis, and it's not clear whether it would
be profitable for the copy case you quote (HW prefetchers like that?).  For
your case it's also invalid, as a and c may overlap.
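
A small sketch of the overlap problem (illustrative only; the helper names
and the length parameter n are assumptions, not part of the quoted example).
When c points one element past a, the forward and reversed loops produce
different results:

```c
#include <stddef.h>

/* Forward copy, same order as the original loop. */
static void copy_forward(size_t n, const double *a, double *c)
{
  for (size_t i = 0; i < n; ++i)
    c[i] = a[i];
}

/* Reversed copy: same statement, opposite iteration order. */
static void copy_reversed(size_t n, const double *a, double *c)
{
  for (size_t i = n; i-- > 0; )
    c[i] = a[i];
}
```

Starting from buf = {1, 2, 3, 4, 5}, copy_forward (4, buf, buf + 1) leaves
{1, 1, 1, 1, 1}, while copy_reversed (4, buf, buf + 1) leaves
{1, 1, 2, 3, 4}.  Without knowing that a and c do not alias, the compiler
cannot swap one order for the other.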

> For example:
>
> void copy (unsigned int N, double *a, double *c)
> {
>   for (int i = 0; i < 800; ++i)
>     c[i] = a[i];
> }
>
> The ivcanon pass reports: Added canonical iv to loop 1, 799 iterations.
> ivtmp_14 = ivtmp_15 - 1;
>
> In ivopt, it selects candidate 10:
>
> Candidate 10:
> Var befor: ivtmp.11
> Var after: ivtmp.11
> Incr POS: before exit test
> IV struct:
> Type: sizetype
> Base: 0
> Step: 8
> Biv: N
>
> If we look at the groups:
>
> Group 0:
> Type: ADDRESS
> Use 0.0:
> At stmt: _5 = *_3;
> At pos: *_3
> IV struct:
> Type: double *
> Base: a_9(D)
> Step: 8
> Object: (void *) a_9(D)
> Biv: N
> Overflowness wrto loop niter: Overflow
>
> Group 1:
> Type: ADDRESS
> Use 1.0:
> At stmt: *_4 = _5;
> At pos: *_4
> IV struct:
> Type: double *
> Base: c_10(D)
> Step: 8
> Object: (void *) c_10(D)
> Biv: N
> Overflowness wrto loop niter: Overflow
>
> Group 2:
> Type: COMPARE
> Use 2.0:
> At stmt: if (ivtmp_14 != 0)
> At pos: ivtmp_14
> IV struct:
> Type: unsigned int
> Base: 799
> Step: 4294967295
> Biv: Y
> Overflowness wrto loop niter: Overflow
>
> The ivopt cost model assumes that groups 0 and 1 will have infinite
> cost for the IV added by the ivcanon pass, because of that IV's lower
> precision.
>
> If I change the example to:
>
> void copy (unsigned int N, double *a, double *c)
> {
>   for (long i = 0; i < 800; ++i)
>     c[i] = a[i];
> }
>
> It still has a higher cost for groups 0 and 1 due to the negative step. I
> think this can be improved. My questions are:
>
> 1. For the case where the loop count is not constant, can we make
> ivcanon add a reverse IV with the current implementation? Can ivopt
> be taught to select the reverse IV?
>
> 2. Or is the patch by Zdenek a better option? I am rebasing it for the trunk.
>
> Thanks,
> Kugan
