> On 01/23/2018 11:02 AM, Jan Hubicka wrote:
> >> On 01/19/2018 12:57 PM, Martin Liška wrote:
> >>> Yes, there's a huge difference between CPU 2006 and 2017: the former has
> >>> 63% with dominant edges, the latter only 11%. It's caused by these two
> >>> benchmarks with high coverage:
> >>>
> >>
> >> Hi.
> >>
> >> I'm sending details about the 2 edges that influence the statistics 
> >> significantly:
> >>
> >>> 500.perlbench_r: regexec.c.065i.profile:
> >>>   negative return heuristics of edge 1368->1370: 2.0%  exec 2477714850 hit 2429863555 (98.1%)
> >>
> >> 2477714850: 3484:S_regtry(pTHX_ regmatch_info *reginfo, char **startposp)
> >>         -: 3485:{
> >> 2477714850: 3486:    CHECKPOINT lastcp;
> >> 2477714850: 3487:    REGEXP *const rx = reginfo->prog;
> >> 2477714850: 3488:    regexp *const prog = ReANY(rx);
> >> 2477714850: 3489:    SSize_t result;
> >> [snip]
> >> 2477714850: 8046:    assert(!result ||  locinput - reginfo->strbeg >= 0);
> >> 2477714850: 8047:    return result ?  locinput - reginfo->strbeg : -1;
> >>         -:  8048:}
> >>
> >> As seen, it returns -1 if a regex is not found, which in the case of
> >> perlbench is the very likely branch.
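> >>
> >> As a minimal sketch of the pattern this predictor keys on (hypothetical
> >> function, not from perlbench): a return of a negative constant is
> >> guessed to be unlikely, while here, as in S_regtry, the -1 "not found"
> >> path is actually the common one:
> >>
> >>   #include <string.h>
> >>
> >>   /* Return the match offset, or -1 when NEEDLE does not occur.  */
> >>   static long
> >>   find_offset (const char *haystack, const char *needle)
> >>   {
> >>     const char *p = strstr (haystack, needle);
> >>     if (p)
> >>       return p - haystack;  /* predicted as the likely path */
> >>     return -1;              /* negative return: predicted unlikely,
> >>                                yet hit ~98% of the time in perlbench */
> >>   }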
> >>
> >>>
> >>> and 523.xalancbmk_r:
> >>> build/build_peak_gcc7-m64.0000/NameDatatypeValidator.cpp.065i.profile:
> >>>   negative return heuristics of edge 3->4: 2.0%  exec 1221735072 hit 1221522453 (100.0%)
> >>
> >> 1221735072:   74:int NameDatatypeValidator::compare(const XMLCh* const lValue
> >>         -:   75:                                   , const XMLCh* const rValue
> >>         -:   76:                                   ,       MemoryManager*     const)
> >>         -:   77:{
> >> 1221735072:   78:    return ( XMLString::equals(lValue, rValue)? 0 : -1);
> >>         -:   79:}
> >>         -:   80:
> >>
> >> IP profile dump file:
> >>
> >> xercesc_2_7::NameDatatypeValidator::compare (struct NameDatatypeValidator 
> >> * const this, const XMLCh * const lValue, const XMLCh * const rValue, 
> >> struct MemoryManager * const D.17157)
> >> {
> >>   bool _1;
> >>   int iftmp.0_2;
> >>
> >>   <bb 2> [count: 1221735072]:
> >>   # DEBUG BEGIN_STMT
> >>   _1 = xercesc_2_7::XMLString::equals (lValue_4(D), rValue_5(D));
> >>   if (_1 != 0)
> >>     goto <bb 4>; [0.02%]
> >>   else
> >>     goto <bb 3>; [99.98%]
> >>
> >>   <bb 3> [count: 1221522453]:
> >>
> >>   <bb 4> [count: 1221735072]:
> >>   # iftmp.0_2 = PHI <0(2), -1(3)>
> >>   return iftmp.0_2;
> >>
> >> }
> >>
> >> Likewise, the XML values are more commonly non-equal.
> >> OK, so may I also mark negative return with PROB_EVEN to track it?
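> >>
> >> For reference, that would be a one-line change to the predictor's
> >> entry in gcc/predict.def, roughly as below (the HITRATE value shown is
> >> illustrative, not necessarily the one in trunk):
> >>
> >>   -DEF_PREDICTOR (PRED_NEGATIVE_RETURN, "negative return", HITRATE (98), 0)
> >>   +DEF_PREDICTOR (PRED_NEGATIVE_RETURN, "negative return", PROB_EVEN, 0)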
> > 
> > Well, if we start disabling predictors just because we can find a
> > benchmark where they do not perform as expected, we will need to disable
> > them all. It is the nature of heuristics to make mistakes; they just
> > need to do something useful for common code. The goal is not to make
> > heuristics that make no mistakes.
> > A heuristic is useful either if it has high precision even when it
> > covers few branches (say noreturn), or if it has large coverage and
> > still a reasonable hitrate (say the opcode, call or return based
> > heuristics).
> > 
> > To make things a bit more esoteric, in practice this is also weighted
> > by how much damage a bad prediction does.  For example, the call
> > heuristic, which predicts the non-call path to be taken, is likely to
> > do little damage.  Even if the call path is common, it is heavy and
> > hard to optimize because of the presence of the call, and thus we
> > don't lose that much in common scenarios.
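> > 
> > A hedged sketch of that asymmetry (the function names are made up):
> > 
> >   extern void report_error (const char *msg);
> > 
> >   int
> >   checked_div (int a, int b)
> >   {
> >     /* Call heuristic: the branch leading to the call is predicted
> >        not taken.  */
> >     if (b == 0)
> >       {
> >         /* Even when this path is common, the call already dominates
> >            its cost, so a wrong guess hurts relatively little.  */
> >         report_error ("division by zero");
> >         return 0;
> >       }
> >     /* The light path that layout and optimization favor.  */
> >     return a / b;
> >   }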
> > 
> > In the second case:
> > 00073 // -----------------------------------------------------------------------
> > 00074 // Compare methods
> > 00075 // -----------------------------------------------------------------------
> > 00076 int NameDatatypeValidator::compare(const XMLCh* const lValue
> > 00077                                    , const XMLCh* const rValue
> > 00078                                    ,       MemoryManager*     const)
> > 00079 {
> > 00080     return ( XMLString::equals(lValue, rValue)? 0 : -1);
> > 00081 }
> > 
> > I would say it is an odd coding style.  I do not see why it is not
> > declared bool and does not return true/false.
> > 
> > The first case seems a bit non-standard too, as it looks like a usual
> > cache lookup where the lookup just frequently fails.
> > 
> > I would thus be in favor of keeping the prediction with a non-50%
> > hitrate unless we run into some performance problems.
> > If it misbehaves on only two benchmarks out of, say, the 20 we tried
> > (in SPEC 2000 and 2017), it seems fine to me.
> > 
> > There is an issue with the way we collect our data.  Using the
> > arithmetic mean across SPEC optimizes for the overall runtime of all
> > SPEC benchmarks combined.  We want instead to look for safer values
> > that do well for the average benchmark, independently of how many
> > branches it executes.  We could consider making the statistics script
> > also collect geometric averages across the different benchmarks.  If
> > we see a big difference between the arithmetic mean and the geomean,
> > we will know there is a small subset of benchmarks which dominates,
> > and we can decide what to do with the probability.
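> > 
> > A minimal sketch of the two aggregates, with made-up numbers: pooling
> > all branch executions lets one huge benchmark dominate, while the
> > geomean of per-benchmark hitrates counts each benchmark once:
> > 
> >   #include <math.h>
> >   #include <stdio.h>
> > 
> >   int
> >   main (void)
> >   {
> >     /* Hypothetical per-benchmark totals for one predictor.  */
> >     double execs[] = { 1e8, 2e8, 1.5e8, 2.4e9 };
> >     double hits[]  = { 9.3e7, 1.8e8, 1.4e8, 2.6e8 };
> >     size_t n = sizeof execs / sizeof execs[0];
> >     double all_execs = 0, all_hits = 0, logsum = 0;
> > 
> >     for (size_t i = 0; i < n; i++)
> >       {
> >         all_execs += execs[i];
> >         all_hits += hits[i];
> >         logsum += log (hits[i] / execs[i]);
> >       }
> >     /* Pooled hitrate: dominated by the 2.4e9-execution benchmark.  */
> >     printf ("pooled hitrate: %.1f%%\n", 100 * all_hits / all_execs);
> >     /* Geomean of per-benchmark hitrates: each benchmark counts once.  */
> >     printf ("geomean:        %.1f%%\n", 100 * exp (logsum / n));
> >     return 0;
> >   }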
> 
> Hello.
> 
> I fully agree that the proper way to do the statistics is a weighted
> average across benchmarks of how much each predictor affects performance.
> Which is nice, but quite hard to achieve. Let's talk in person about how
> we can change it in the next stage1.
> 
> For now, in the case of the 'negative return' predictor, I tend to leave
> it as it is and install the rest of the patch, which sets the PROB_EVEN
> probability.

OK, makes sense to me.
Honza
> 
> Does that work for you, Honza?
> Martin
> 
> > 
> > Honza
> > 
> >>
> >> Thanks,
> >> Martin
> >>
> >>>
> >>> Any ideas on what to do with the predictor for the GCC 8 release?
> >>> Martin
