http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50955

bin.cheng <amker.cheng at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amker.cheng at gmail dot com

--- Comment #17 from bin.cheng <amker.cheng at gmail dot com> ---
Hi Richard,
I am having difficulty understanding the cases in this PR.
For the reported case with two loops:

  for( y=0; y<4; y++, pDst += dstStep ) {
    for( x=y+1; x<4; x++ ) {
                    s = ( p1[x-y-1] + p1[x-y] + p1[x-y] + p1[x-y+1] + 2 ) >> 2;
                    pDst[x] = (unsigned char)s;
    }

    pDst[y] = p3;
  }

The dump for the statement 'pDst[y] = p3;' before IVOPT looks like this:

<bb 4>:
Invalid sum of incoming frequencies 1667, should be 278
  y.2_64 = (sizetype) y_89;
  D.6421_65 = pDst_88 + y.2_64;
  *D.6421_65 = p3_37;
  pDst_69 = pDst_88 + pretmp.21_118;
  ivtmp.35_116 = ivtmp.35_87 - 1;
  if (ivtmp.35_116 != 0)
    goto <bb 18>;
  else
    goto <bb 19>;


IVOPT chooses candidate 15:
candidate 15
  depends on 3
  var_before ivtmp.154
  var_after ivtmp.154
  incremented before exit test
  type unsigned int
  base (unsigned int) pDst_39(D) - (unsigned int) &p1
  step (unsigned int) (pretmp.21_118 + 1)
for use 1:
use 1
  address
  in statement *D.6421_65 = p3_37;

  at position *D.6421_65
  type unsigned char *
  base pDst_39(D)
  step pretmp.21_118 + 1
  base object (void *) pDst_39(D)
  related candidates 

After rewriting, the dump is like:

<bb 4>:
Invalid sum of incoming frequencies 1667, should be 278
  MEM[symbol: p1, index: ivtmp.154_200, offset: 0B] = p3_37;
  pDst_69 = pDst_88 + pretmp.21_118;
  ivtmp.149_218 = ivtmp.149_249 - 1;
  ivtmp.154_190 = ivtmp.154_200 + D.6617_250;
  if (x_40 != 4)
    goto <bb 18>;
  else
    goto <bb 19>;

Eventually, the store to TMR[p1,ivtmp,0] is considered local to p1 and is
deleted.
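
To make the problem with candidate 15 concrete, here is a minimal C sketch of
the address arithmetic it implies.  The names dst, index_from_p1 and
rebased_address, and the array sizes, are hypothetical stand-ins and not taken
from the test case.  The sketch only shows that "symbol p1 plus index" can be
numerically equal to an address inside a different object, which appears to be
why a pass that trusts the symbol can wrongly treat the store as touching only
p1.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for two unrelated objects (sizes are assumed). */
unsigned char p1[16];
unsigned char dst[16];

/* Model of candidate 15's value: the distance from p1 to the store target,
   like "(unsigned int) pDst - (unsigned int) &p1" in the dump. */
uintptr_t index_from_p1(const unsigned char *target)
{
    return (uintptr_t)target - (uintptr_t)p1;
}

/* Model of MEM[symbol: p1, index: iv, offset: 0B]: the same address,
   but written as if it were derived from p1. */
unsigned char *rebased_address(uintptr_t iv)
{
    return (unsigned char *)((uintptr_t)p1 + iv);
}

/* Round trip: the rebased address equals the original target in dst. */
void check_round_trip(void)
{
    unsigned char *target = &dst[3];
    assert(rebased_address(index_from_p1(target)) == target);
}
```

The integer round trip relies on the usual flat-address behaviour of
pointer/integer conversions; the point is only that the resulting reference
names p1 although it never accesses p1.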

But for your reduced case:

  p3 = (unsigned char)(((signed int)p1[1] + (signed int)p2[1]
                        + (signed int)p1[0] +(signed int)p1[0] + 2 ) >> 2 );

  for( x=y+1; x<4; x++ ) {
      s = ( p1[x-y-1] + p1[x-y] + p1[x-y] + p1[x-y+1] + 2 ) >> 2;
      pDst[x] = (unsigned char)s;
  }

  pDst[y] = p3;

It is about the TMR in the dump below (before IVOPT):

<bb 6>:
  # vect_pp1.30_166 = PHI <vect_pp1.30_167(16), vect_pp1.33_165(5)>
  # vect_pp1.37_176 = PHI <vect_pp1.37_177(16), vect_pp1.40_175(5)>
  # vect_pp1.46_194 = PHI <vect_pp1.46_195(16), vect_pp1.49_193(5)>
  # vect_p.60_223 = PHI <vect_p.60_224(16), vect_p.63_222(5)>
  # ivtmp.64_225 = PHI <ivtmp.64_226(16), 0(5)>
  ...
  MEM[(unsigned char *)vect_p.60_223] = vect_var_.58_219;
  vect_pp1.30_167 = vect_pp1.30_166 + 8;
  vect_pp1.37_177 = vect_pp1.37_176 + 8;
  vect_pp1.46_195 = vect_pp1.46_194 + 8;
  vect_p.60_224 = vect_p.60_223 + 8;
  ivtmp.64_226 = ivtmp.64_225 + 1;
  if (ivtmp.64_226 < bnd.27_128)
    goto <bb 16>;
  else
    goto <bb 7>;

Your patch prevents IVOPT from choosing cand 4:
candidate 4 (important)
  var_before ivtmp.110
  var_after ivtmp.110
  incremented before exit test
  type unsigned int
  base (unsigned int) (&p1 + 8)
  step 8
  base object (void *) &p1
for use 3:
use 3
  generic
  in statement vect_p.60_223 = PHI <vect_p.60_224(16), vect_p.63_222(5)>

  at position 
  type vector(8) unsigned char *
  base batmp.61_221 + 1
  step 8
  base object (void *) batmp.61_221
  is a biv
  related candidates 

This prevents IVOPT from rewriting the loop into:

<bb 6>:
  # ivtmp.107_150 = PHI <ivtmp.107_256(16), 0(5)>
  # ivtmp.110_241 = PHI <ivtmp.110_146(16), ivtmp.110_132(5)>
  D.6585_133 = (unsigned int) batmp.61_221;
  p1.131_277 = (unsigned int) &p1;
  D.6587_278 = D.6585_133 - p1.131_277;
  D.6588_279 = D.6587_278 + ivtmp.110_241;
  D.6589_280 = D.6588_279 + 4294967289;
  D.6590_281 = (vector(8) unsigned char *) D.6589_280;
  vect_p.60_223 = D.6590_281;
  ...
  MEM[(unsigned char *)vect_p.60_223] = vect_var_.58_219;
  ivtmp.107_256 = ivtmp.107_150 + 1;
  ivtmp.110_146 = ivtmp.110_241 + 8;
  if (ivtmp.107_256 < bnd.27_128)
    goto <bb 16>;
  else
    goto <bb 7>;

This in turn prevents IVOPT from generating candidate 15 in the outer loop.
(Expressing use 3 by cand 4 is good in itself, right?)
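
As a sanity check on the rewritten statements, the constant 4294967289 is -7
in 32-bit unsigned arithmetic, i.e. (use base offset 1) - (cand base offset 8).
The C sketch below replays the D.6587 .. D.6589 computation; the function name
and the sample addresses are hypothetical, and it assumes ivtmp.110 starts at
the candidate base (unsigned int) (&p1 + 8) as listed for candidate 4.

```c
#include <assert.h>
#include <stdint.h>

/* Rebuild use 3 (base batmp + 1, step 8) from candidate 4
   (base &p1 + 8, step 8) in 32-bit unsigned arithmetic. */
uint32_t use_from_cand(uint32_t batmp, uint32_t p1_addr, uint32_t ivtmp)
{
    uint32_t d1 = batmp - p1_addr;      /* D.6587 = batmp - &p1  */
    uint32_t d2 = d1 + ivtmp;           /* D.6588 = D.6587 + iv  */
    return d2 + UINT32_C(4294967289);   /* D.6589 = D.6588 - 7   */
}

/* With made-up addresses: iteration k must yield batmp + 1 + 8 * k. */
void check_use_from_cand(void)
{
    uint32_t batmp = 0x2000, p1_addr = 0x1000;
    for (uint32_t k = 0; k < 4; k++)
        assert(use_from_cand(batmp, p1_addr, p1_addr + 8 + 8 * k)
               == batmp + 1 + 8 * k);
}
```

So the rewritten value of vect_p.60_223 is unchanged, but it is now computed
from an iv whose base object is p1.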


-------------------------------
However, it seems that the check:

  if (address_p)
    {
      /* Do not try to express address of an object with computation based
         on address of a different object.  This may cause problems in rtl
         level alias analysis (that does not expect this to be happening,
         as this is illegal in C), and would be unlikely to be useful
         anyway.  */
      if (use->iv->base_object
          && cand->iv->base_object
          && !operand_equal_p (use->iv->base_object, cand->iv->base_object, 0))
        return infinite_cost;

failed because cand(15)->iv->base_object == NULL.  So the reported case is
not about an iv use that appears in a memory reference without being marked as
address_p, and it could be fixed by revising the existing check condition.
Is that right?
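
A simplified C model of the quoted condition shows how a candidate without a
base object slips through.  The struct iv_model and the string-based
comparison are hypothetical simplifications (the real code compares trees with
operand_equal_p); only the shape of the condition is taken from the quoted
source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an iv; only the base object is modelled,
   as a name string (NULL means "no base object recorded"). */
struct iv_model {
    const char *base_object;
};

/* Same shape as the quoted check: returns 1 where the real code
   would return infinite_cost, 0 where the candidate stays eligible. */
int rejected_by_base_object_check(const struct iv_model *use,
                                  const struct iv_model *cand)
{
    return use->base_object != NULL
           && cand->base_object != NULL
           && strcmp(use->base_object, cand->base_object) != 0;
}

/* Use 1 is based on pDst; a candidate with no base object is not
   rejected, while one explicitly based on p1 is. */
void check_null_base_object(void)
{
    struct iv_model use1 = { "pDst_39(D)" };
    struct iv_model no_base = { NULL };
    struct iv_model based_on_p1 = { "&p1" };

    assert(rejected_by_base_object_check(&use1, &no_base) == 0);
    assert(rejected_by_base_object_check(&use1, &based_on_p1) == 1);
}
```

In this model the candidate with a NULL base object behaves like candidate 15:
it involves &p1 in its base expression but is never compared against the use's
base object.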

PS: Sorry for replying to a fixed PR; I found it nearly impossible to fix
PR52272 without fully understanding this one.
