On 2019/11/21 8:10 PM, Richard Biener wrote:
On Thu, Nov 21, 2019 at 10:22 AM Li Jia He <heli...@linux.ibm.com> wrote:

Hi,

I found for the follow code:

#define N 256
int a[N][N][N], b[N][N][N];
int d[N][N], c[N][N];
void __attribute__((noinline))
double_reduc (int n)
{
    for (int k = 0; k < n; k++)
    {
      for (int l = 0; l < n; l++)
       {
         c[k][l] = 0;
          for (int m = 0; m < n; m++)
            c[k][l] += a[k][m][l] * d[k][m] + b[k][m][l] * d[k][m];
       }
    }
}

I dumped the file after loop interchange and got the following information:

<bb 3> [local count: 118111600]:
    # m_46 = PHI <0(7), m_45(11)>
    # ivtmp_44 = PHI <_42(7), ivtmp_43(11)>
    _39 = _49 + 1;

    <bb 4> [local count: 955630224]:
    # l_48 = PHI <0(3), l_47(12)>
    # ivtmp_41 = PHI <_39(3), ivtmp_40(12)>
    c_I_I_lsm.5_18 = c[k_28][l_48];
    c_I_I_lsm.5_53 = m_46 != 0 ? c_I_I_lsm.5_18 : 0;
    _2 = a[k_28][m_46][l_48];
    _3 = d[k_28][m_46];
    _4 = _2 * _3;
    _5 = b[k_28][m_46][l_48];
    _6 = _3 * _5;
    _7 = _4 + _6;
    _8 = _7 + c_I_I_lsm.5_53;
    c[k_28][l_48] = _8;
    l_47 = l_48 + 1;
    ivtmp_40 = ivtmp_41 - 1;
    if (ivtmp_40 != 0)
      goto <bb 12>; [89.00%]
    else
      goto <bb 5>; [11.00%]

we can see '_3 = d[k_28][m_46];'  is a loop invariant.
Do we need to add a loop invariant motion pass after the loop interchange?

There is one at the end of the loop pipeline.

Hi,

The one at the end of the loop pipeline may miss some optimization
opportunities.  If we vectorize the above code (a.c.158t.vect), we
can get information similar to the following:

bb 3:
 # m_46 = PHI <0(7), m_45(11)>  // loop m, outer loop
  if (_59 <= 2)
    goto bb 20;
  else
    goto bb 15;

bb 15:
  _89 = d[k_28][m_46];
  vect_cst__90 = {_89, _89, _89, _89};

bb 4:
   # l_48 = PHI <l_47(12), 0(15)> // loop l, inner loop
  vect__6.23_100 = vect_cst__99 * vect__5.22_98;
   if (ivtmp_110 < bnd.8_1)
    goto bb 12;
  else
    goto bb 17;

bb 20:
bb 18:
   _27 = d[k_28][m_46];
if (ivtmp_12 != 0)
    goto bb 19;
  else
    goto bb 21;

Vectorization will do some conversions in this case.  We can see
‘ _89 = d[k_28][m_46];’ and ‘_27 = d[k_28][m_46];’ are loop invariant
relative to loop l.  We can move ‘d[k_28][m_46]’ to the front of
‘if (_59 <= 2)’ to get rid of loading data from memory in both branches.

The one at at the end of the loop pipeline can't handle this situation.
If we move d[k_28][m_46] from loop l to loop m before doing
vectorization, we can get rid of this situation.

--
BR,
Lijia He


BR,
Lijia He


Reply via email to