> On Fri, 10 Jun 2016, Jan Hubicka wrote: > > > > > I always interpreted the estimated number of iterations to be the same > > > > as > > > > expected number of iterations and to be same as average. So it seems > > > > to be > > > > sane to feed the info from profile. > > > > > > > > I am thinking to add the histograms, yes. It is midly anoying to do so > > > > becuase > > > > one needs to intrstrument all exit edges out of loop. I guess we don't > > > > care > > > > much if histogram will get lost on abnormals. > > > > > > Can't we just instrument the loop header (and thus get "averages" for all > > > exits)? > > > > You need to know number of iterations, so either you save it into a global > > and > > flush into profile next time edge header->preheader is executed or you > > instrument > > exists. At least when one wants something more precise than average > > iteration count > > I think. > > > > > > > One option I am thinking about is to introduce counter that will take > > > > two > > > > parameters A,B. It will record linear histogram for values in range > > > > 0...A and > > > > logarithmic histogram for values greater than A capping by 2^B (so we > > > > don't > > > > need 64 counters for every loop). I think we are not really that > > > > interested in > > > > precise histograms for loops that iterate more than, say, 2^10 times. > > > > We only > > > > need to know that it loops a lot. We however care about low iteration > > > > counts > > > > to make good decision for peeling. This is still 26 counters per loop > > > > that is > > > > quite a lot. > > > > > > I think what we are interested in is "clusters" of niters. So yes, > > > having a histogram for low values (can we use HIST_TYPE_POW2 here?) > > > makes sense. Or simply sum all n <= 16 and all n > 16 values separately? > > > We seem to lack a histogram type where we can specify N differently-sized > > > bins. > > > > HIST_TYPE_POW2 measure how often value is exactly power of two and we don't > > have histogram counter at all, so we need to introduce one ;) > > > > We have HIST_TYPE_INTERVAL that is real linear histogram for given interval > > of values. > > > > > > > A lot cheaper alternative may be to simply measure loop peeling limit > > > > by > > > > having counter that counts how often loop exits in first > > > > PARAM_MAX_PEEL_TIMES iterations and second counter that measure what is > > > > the maximal number of iterations in this case (we really want likely > > > > maximal number of iterations but that seems harder to get). This will > > > > determine peeling limit which we can then store into loop structure. > > > > > > Sure. > > > > > > In my original loop value-profiling work the most interesting (for > > > performance) case was value-profiling variable strides and then > > > versioning for stride == 1. That triggered a lot with fortran > > > array descriptors though nowadays IPA-CP of aggregates plus LTO > > > might have solved this. > > > > ipa-cp is not very powerful on propagating those. Measuring strides is > > interesting idea. I forgot about your work on the loop histograms. Do you > > still have patches around? > > No, but those weren't complete anyway as we didn't preserve loops > back in time. I think I only implemented versioning assuming the > stride would be one unconditionally to show the possible benefit > (it was for the 2006 Summit).
OK. I think i will start with the iteration histograms (small values separately up to peel limit and large values in log2) and we will see the overall cost of this. We can always backtrack to (iteration count #) cheaper option or try to optimize things a bit. Honza > > Richard.