------- Comment #168 from rguenther at suse dot de  2007-06-05 21:17 -------
Subject: Re:  [4.0/4.1/4.2/4.3 Regression] placement
 new does not change the dynamic type as it should

On Tue, 5 Jun 2007, ian at airs dot com wrote:

> ------- Comment #167 from ian at airs dot com  2007-06-05 20:48 -------
> Can you give me a .ii file for the performance regression, and point me at the
> relevant function?

http://www.suse.de/~rguenther/tramp3d/tramp3d-v4.cpp.gz

Among the interesting functions (yep, there are multiple) are
those called Momentumflux*::operator().  One particular example is
Adv5::Z::MomentumfluxX<Dim>::operator(), which mangles as
_ZNK4Adv51Z13MomentumfluxXILi3EEclI5FieldI22UniformRectilinearMeshI10MeshTraitsILi3Ed21UniformRectilinearTag12CartesianTagLi3EEEd7CompFwdI6EngineILi3E6VectorILi3Ed4FullE10BrickViewUE3LocILi1EEEES4_ISA_dSG_ESM_SM_EEvRKT_RKT0_RKT1_RKT2_RKSI_ILi3EE
It has this initialization loop (which is fixed by the tramp3d patch)
inside the computational kernel (a triple-nested loop):

<bb 35>:
  D.760598_367 = &D.464122.engine_m.x_m[i_368];
  <<<change_dynamic_type (double *) D.760598_367)>>>
  iftmp.913_369 = &D.464122.engine_m.x_m[i_368];
  if (1)
    goto <bb 36>;
  else
    goto <bb 37>;

<bb 36>:
  *iftmp.913_369 = 0.0;

<bb 37>:
  i_370 = i_368 + 1;

<bb 38>:
  # i_368 = PHI <0(34), i_370(37)>
  if (i_368 <= 2)
    goto <bb 35>;
  else
    goto <bb 39>;

(That dump is taken after forwprop1, actually.)  It's important that we
unroll this loop completely (so that we recognize that the 0.0 stores are
all superseded by later stores) and that we move all loop-invariant loads
out of the triple-nested loops, across this initialization loop.
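
For readers without the testcase at hand, here is a rough C++ sketch of the
kind of source that can produce such an initialization loop.  It is not taken
from tramp3d: SmallVec and kernel_step are hypothetical names, and the
assumption is that the small vector's constructor value-initializes each
element with placement new, which would account for the change_dynamic_type
statement on every iteration.

```cpp
#include <new>

// Hypothetical stand-in for the fixed-size vector storage seen in the dump
// (the x_m member name is borrowed from it; everything else is assumed).
template <int N>
struct SmallVec {
    double x_m[N];

    SmallVec() {
        // Element-wise placement new: each slot is value-initialized to 0.0.
        // This is the short loop that needs to be unrolled completely.
        for (int i = 0; i < N; ++i)
            new (&x_m[i]) double();
    }
};

// Hypothetical kernel body: the temporary is zero-initialized by the
// constructor and then every element is overwritten, so the 0.0 stores
// are dead once the constructor loop has been unrolled.
inline double kernel_step(const double* origin, const double* spacing, int idx) {
    SmallVec<3> v;
    for (int i = 0; i < 3; ++i)
        v.x_m[i] = origin[i] + spacing[i] * idx;
    return v.x_m[0] + v.x_m[1] + v.x_m[2];
}
```

In a sketch like this, the loads of origin[i] and spacing[i] are invariant
with respect to any enclosing loops over idx, so they can only be hoisted if
the optimizer can prove that the constructor's stores do not alias them.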

The optimized dump for all these functions should be 'easy to grasp and
obviously fast' - at least that's what it used to be.  Now with this
patch we still have

  <<<change_dynamic_type (double *) &D.767646.engine_m.x_m[0])>>>
  D.767646.engine_m.x_m[0] = 0.0;
  <<<change_dynamic_type (double *) &D.767646.engine_m.x_m[1])>>>
  D.767646.engine_m.x_m[1] = 0.0;
  <<<change_dynamic_type (double *) &D.767646.engine_m.x_m[2])>>>
  D.767646.engine_m.x_m[2] = 0.0;

in there and loads of index/domain variables on the MEM expressions like

  MEM[base: &D.767646, index: D.1312916, step: 8] = 
D.767266->origin_m.engine_m.x_m[i] + D.767266->spacings_m.engine_m.x_m[i] 
* (double) (MEM[base: &D.767512, index: D.1312916, step: 4] - 
D.767266->D.225459.physicalCellDomain_m.D.114276.D.113975.domain_m[i].D.110801.D.45225.D.45039.domain_m[0]);

Richard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29286
