> On Thu, 9 Jun 2016, Jan Hubicka wrote:
> 
> > Hi,
> > after we read the profile, we know expected number of iterations.
> 
> We know the average ;)  It may make sense to add some histogram
> value profiling for niter now that we should easily be able to do so.

I always interpreted the estimated number of iterations to be the same as the
expected number of iterations, and the same as the average.  So it seems sane
to feed in the info from the profile.

I am thinking of adding the histograms, yes.  It is mildly annoying to do
because one needs to instrument all exit edges out of the loop.  I guess we
don't care much if the histogram gets lost on abnormal edges.

One option I am thinking about is to introduce a counter that takes two
parameters A and B.  It will record a linear histogram for values in the range
0...A and a logarithmic histogram for values greater than A, capped at 2^B (so
we don't need 64 counters for every loop).  I think we are not really that
interested in precise histograms for loops that iterate more than, say, 2^10
times; we only need to know that they loop a lot.  We do, however, care about
low iteration counts to make good decisions for peeling.  This is still 26
counters per loop, which is quite a lot.
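To make the bucketing concrete, here is a minimal sketch of how such a
two-parameter counter could map an iteration count to a bucket index.  The
names (hist_bucket, HIST_A, HIST_B) and the particular values of A and B are
hypothetical, not anything from GCC's actual value-profiling code; the point
is just the linear-then-logarithmic split described above.

```c
/* Hypothetical sketch of the proposed two-parameter histogram counter.
   Values below A each get their own linear bucket; larger values share
   one bucket per power of two, capped at 2^B.  */

#define HIST_A       8   /* one linear bucket per value 0..A-1 */
#define HIST_A_LOG2  3   /* floor(log2(HIST_A)) */
#define HIST_B       10  /* everything >= 2^B lands in the last bucket */

/* Map an observed iteration count to a bucket index.
   Total buckets: HIST_A linear + (HIST_B - HIST_A_LOG2) logarithmic.  */
static int
hist_bucket (unsigned long niter)
{
  if (niter < HIST_A)
    return (int) niter;          /* linear part: exact low counts */

  /* Logarithmic part: compute floor(log2(niter)).  */
  int lg = 0;
  unsigned long v = niter;
  while (v >>= 1)
    lg++;
  if (lg > HIST_B)
    lg = HIST_B;                 /* cap: "it loops a lot" */

  return HIST_A + (lg - HIST_A_LOG2);
}
```

With A = 8 and B = 10 this needs only 8 + 7 = 15 bucket indices (0..15):
counts 0..7 are exact, 8..15 share a bucket, 16..31 the next, and anything
at or above 1024 falls into the final bucket.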

A much cheaper alternative may be to simply measure the loop peeling limit by
having one counter that counts how often the loop exits within the first
PARAM_MAX_PEEL_TIMES iterations, and a second counter that measures the
maximal number of iterations in this case (we really want the likely maximal
number of iterations, but that seems harder to get).  This determines the
peeling limit, which we can then store into the loop structure.
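The two-counter scheme above could look roughly like this.  Everything here
(the struct, the record_exit helper, and the value 16 for the peel limit) is
an illustrative stand-in, not GCC's actual instrumentation:

```c
/* Hypothetical sketch of the cheap two-counter alternative: one counter
   for how often the loop exits within the peel limit, one for the
   maximal iteration count seen among those early exits.  */

#define MAX_PEEL_TIMES 16  /* stand-in for PARAM_MAX_PEEL_TIMES */

struct peel_counters
{
  unsigned long early_exits;  /* exits within the first MAX_PEEL_TIMES iters */
  unsigned long max_early;    /* max niter observed among those exits */
};

/* Called on each loop exit with the iteration count of that execution.  */
static void
record_exit (struct peel_counters *c, unsigned long niter)
{
  if (niter <= MAX_PEEL_TIMES)
    {
      c->early_exits++;
      if (niter > c->max_early)
        c->max_early = niter;
    }
  /* Exits beyond the limit are deliberately not recorded: they can
     never influence the peeling decision.  */
}
```

At profile read-in, max_early would then bound how many iterations are worth
peeling, and early_exits (compared against the loop's total execution count)
would tell whether peeling pays off at all.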

The vectorizer, unroller, prefetcher and most other classical loop opts care
about the average being large enough, so the current expected value seems to
do the trick.

This does not let you update the profile very precisely after peeling (and
also peeling done by the vectorizer), but it needs only 2 counters per loop.

What other passes, besides peeling, would immediately benefit from a
histogram?  I wonder if you can think of a better scheme?
> 
> > Currently we use the profile each time estimate_numbers_of_iterations_loop
> > is called to recompute this value.  This is not very safe because the
> > profile may be mis-updated.  It seems safer to compute it once and
> > maintain it throughout the compilation.
> > 
> > Notice that I removed:
> > -  /* Force estimate computation but leave any existing upper bound in place.  */
> > -  loop->any_estimate = false;
> > from the beginning of estimate_numbers_of_iterations_loop.  I cannot make
> > sense of this.  Even without a profile, if we have an estimate we are
> > better off maintaining it, because later we may not be able to derive it
> > again.  There seems to be no code that is forced by setting
> > loop->any_estimate = true.  The only code that cares seems to be
> > record_niter_bound, which only decreases existing estimates.  This seems a
> > sane procedure - we don't roll loops.
> > 
> > Bootstrapped/regtested x86_64-linux, OK?
> 
> Ok.  Did you check what this does to SPEC with FDO?

I didn't do a full SPEC run with FDO; I only tested tramp3d and xalancbmk,
which I have readily available.  Any regressions would, however, point to
loop info updating bugs (or wrong use of the profile), so I will look into
them if they appear.  I am trying to benchmark Firefox now.

Honza
