On 10/5/12, Diego Novillo <dnovi...@google.com> wrote:
> On Oct 5, 2012 Richard Guenther <richard.guent...@gmail.com> wrote:
> > Sorry, that wasn't intended.  I question these numbers because
> > unless you bootstrap say 100 times the noise in bootstrap
> > speed is way too high to make such claims.  Of course critical
> > information is missing:
>
> I agree with Nathan.  Your tone is sometimes borderline insulting.
> It creates unnecessary friction and does not serve anybody's
> purpose.  There is no need to be so antagonistic at all times.
>
> > "The new code bootstraps .616% faster with a 99% confidence of
> > being faster."
> >
> > 99% confidence on what basis?  What's your sample size?
>
> Perhaps Lawrence can explain a bit more how he's getting these
> numbers.  But they are not pulled out of thin air and he does go to
> the extra effort of measuring them and computing the differences.

The intent of the work is to compare the performance of the
unmodified compiler and the compiler with my patch.

For each compiler, I run the third stage of the bootstrap with
-ftime-report ten times.  Running the third stage tests two
things at once.  The stage-two compiler has all the benefits of
any performance improvements that the restructuring can deliver
to end users.  But since it is compiling stage three, it also
accounts for any increased time that the new C++ code adds to
the bootstrap.  The end customer won't pay that cost, but it was
a concern among GCC developers.
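For reference, pulling the total out of each saved -ftime-report
log can be sketched roughly as below.  This is an illustration,
not my exact script; the report's column layout varies across GCC
versions, so it just takes the first numeric field on the TOTAL
line, and the sample fragment is made up.

```python
import re

# Sketch: extract the first numeric field from the TOTAL line of a
# saved -ftime-report summary.  Column meanings (usr/sys/wall) vary
# by GCC version, so treat this as illustrative only.
def total_time(log_text):
    for line in log_text.splitlines():
        m = re.match(r"\s*TOTAL\s*:\s*([0-9]+\.[0-9]+)", line)
        if m:
            return float(m.group(1))
    return None

# Illustrative (invented) fragment of a -ftime-report summary.
sample_log = """\
 phase parsing           :   1.02             0.21             1.25
 TOTAL                   :  12.34             1.56            13.90
"""
print(total_time(sample_log))  # 12.34
```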

By parsing the log files, I extract the total CPU time for each
run.  That gives me two samples, each with ten data points.  Each
sample has a sample mean and a variance, from which you can
compute a confidence interval within which the true mean is
likely to lie.  You can then compare the two confidence intervals
to determine the likelihood that one compiler is faster or slower
than the other.  So, in the statement "The new code bootstraps
.616% faster with a 99% confidence of being faster", the last
phrase says that if we were to run the same experiment 100 times,
we might get one case in which the patched compiler appeared
slower.
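The comparison can be sketched as follows.  This is not my actual
script and the timing numbers are invented; it uses Student's t
critical value for 9 degrees of freedom (3.250 for a two-sided
99% interval) and declares a speedup when the two intervals do
not overlap.

```python
import math
import statistics

def confidence_interval(sample, t_crit):
    """Return (low, high) bounds on the true mean, given a Student's
    t critical value appropriate for len(sample) - 1 degrees of freedom."""
    n = len(sample)
    mean = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
    return (mean - t_crit * sem, mean + t_crit * sem)

# Hypothetical stage-3 total CPU times in seconds, ten runs each.
baseline = [612.4, 610.8, 613.1, 611.5, 612.0, 610.9, 612.7, 611.8, 612.2, 611.4]
patched  = [608.3, 607.9, 609.0, 608.5, 607.6, 608.8, 608.1, 607.4, 608.6, 608.2]

# Two-sided Student's t critical value for 99% confidence, df = 9.
T_99_DF9 = 3.250

lo_base, hi_base = confidence_interval(baseline, T_99_DF9)
lo_patch, hi_patch = confidence_interval(patched, T_99_DF9)

# Non-overlapping intervals mean the patched compiler is faster at
# (at least) this confidence level.
if hi_patch < lo_base:
    speedup = 100.0 * (statistics.mean(baseline) - statistics.mean(patched)) \
              / statistics.mean(baseline)
    print(f"faster by {speedup:.3f}% at 99% confidence")
else:
    print("intervals overlap; no conclusion at 99% confidence")
```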

A 95% confidence is considered sufficient for most purposes,
including medical interventions.  Compile time isn't that
important, so we could easily get by on 70% confidence.

In any event, the sample size is only relevant to the extent
that larger sample sizes yield more confidence.  More consistent
runs also yield more confidence, as do algorithmic changes, which
produce larger differences.  Since I report the confidence, you
don't need to worry about sample size, isolation from system
contention, etc.  All of those issues are already reflected in
the confidence.
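The trade-off above can be illustrated numerically: the
half-width of a t confidence interval is t * s / sqrt(n), so both
more runs (larger n) and more consistent runs (smaller s) shrink
the interval and raise the confidence in a fixed observed
difference.  The standard deviation below is made up; the t
values are the standard two-sided 95% critical values for the
given degrees of freedom.

```python
import math

# Half-width of a 95% confidence interval as the sample grows.
# s is an illustrative run-to-run standard deviation in seconds.
s = 0.75
for n, t_crit in [(10, 2.262), (40, 2.023), (160, 1.975)]:
    half_width = t_crit * s / math.sqrt(n)
    print(n, round(half_width, 3))
```

The interval narrows roughly with the square root of the sample
size, which is why doubling the run count buys less than it might
seem.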

> > Why does the patch need this kind of "marketing"?
>
> Because (a) we have always said that we want to make sure that
> the C++ conversion provides useful benefits, and (b) there has
> been so much negative pressure on our work, that we sometimes
> try to find some benefit when reality may provide neutral results.

Yes, in particular, there was some concern that the cost of compiling
the templates used in the hash tables would increase the bootstrap
time significantly.  In these cases, I have shown that the benefit
of using them exceeds the cost of compiling them.

If no one cares about these time reports, then I will gladly stop
spending the effort to make them.

> > > I, for one, think that it's excellent that Lawrence is
> > > writing these cleanup patches and measuring what impact they
> > > have on performance.  Bonus points that they are making the
> > > compiler faster.  Speed of the compiler *is* a scalability
> > > issue, and it's one that GCC doesn't appear to have paid all
> > > that much attention to over the years.
> >
> > I just don't believe the 0.5% numbers.
>
> Then ask.  Don't mock, please.

-- 
Lawrence Crowl
