On 10/5/12, Diego Novillo <dnovi...@google.com> wrote:
> On Oct 5, 2012 Richard Guenther <richard.guent...@gmail.com> wrote:
> > Sorry, that wasn't intended. I question these numbers because
> > unless you bootstrap say 100 times the noise in bootstrap
> > speed is way too high to make such claims. Of course critical
> > information is missing:
>
> I agree with Nathan. Your tone is sometimes borderline insulting.
> It creates unnecessary friction and does not serve anybody's
> purpose. There is no need to be so antagonistic at all times.
>
> > "The new code bootstraps .616% faster with a 99% confidence of
> > being faster."
> >
> > 99% confidence on what basis? What's your sample size?
>
> Perhaps Lawrence can explain a bit more how he's getting these
> numbers. But they are not pulled out of thin air and he does go to
> the extra effort of measuring them and computing the differences.
The intent of the work is to compare the performance of the unmodified
compiler against the compiler with my patch. For each compiler, I run
the third stage of the bootstrap with -ftime-report ten times. Running
the third stage tests two things. The stage-two compiler doing that
compilation has all the benefits of any performance improvements that
the restructuring can deliver to end users. But because it is compiling
stage three, it also accounts for any increased time that the new C++
code adds to the bootstrap. The end customer won't pay that cost, but
it was a concern among GCC developers.

By parsing the log files, I extract the total CPU time for each run.
So I have two samples, each with ten data points. Each sample has a
sample mean and a variance, from which you can compute a confidence
interval in which the true mean is likely to lie. You can then compare
the two confidence intervals to determine the likelihood that one
compiler is faster or slower than the other.

So, in the statement "The new code bootstraps .616% faster with a 99%
confidence of being faster", the last phrase means that if we were to
run that same experiment 100 times, we might get one case in which the
patched compiler appeared slower. A 95% confidence is considered
sufficient for most purposes, even for medical interventions.
Compile time isn't that important, so we could easily get by on 70%
confidence.

In any event, the sample size is only relevant to the extent that
larger sample sizes yield more confidence. More consistent runs also
yield more confidence. Algorithmic changes, which would yield larger
differences, would also yield more confidence. Since I report the
confidence, you don't need to worry about sample size, isolation from
system contention, etc. All those issues would already have affected
the reported confidence. (A rough sketch of this comparison appears
after my signature.)

> > Why does the patch need this kind of "marketing"?
>
> Because (a) we have always said that we want to make sure that
> the C++ conversion provides useful benefits, and (b) there has
> been so much negative pressure on our work, that we sometimes
> try to find some benefit when reality may provide neutral results.

Yes. In particular, there was some concern that the cost of compiling
the templates used in the hash tables would increase the bootstrap
time significantly. In these cases, I have shown that the benefit of
using them exceeds the cost of compiling them.

If no one cares about these time reports, then I will gladly stop
spending the effort to make them.

> > > I, for one, think that it's excellent that Lawrence is
> > > writing these cleanup patches and measuring what impact they
> > > have on performance. Bonus points that they are making the
> > > compiler faster. Speed of the compiler *is* a scalability
> > > issue, and it's one that GCC doesn't appear to have paid all
> > > that much attention to over the years.
> >
> > I just don't believe the 0.5% numbers.
>
> Then ask. Don't mock, please.

--
Lawrence Crowl
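P.S. For the curious, the comparison can be sketched roughly as
follows. This is illustrative only, not my actual script: the log
directory layout, the regex for the -ftime-report TOTAL line, and the
choice of Welch's t-test (via SciPy) are all assumptions made for the
example.

    #!/usr/bin/env python3
    # Illustrative sketch: compare two samples of bootstrap timings and
    # report the confidence that the patched compiler is faster.
    # ASSUMPTIONS: baseline/*.log and patched/*.log each hold one
    # -ftime-report dump per run, and the regex below matches the
    # TOTAL line in those dumps; adjust both to your setup.

    import glob
    import re
    import statistics
    from scipy import stats

    TOTAL_RE = re.compile(r"^\s*TOTAL\s*:\s*([0-9.]+)", re.MULTILINE)

    def total_cpu_seconds(pattern):
        """Extract total CPU time from each log file matching pattern."""
        times = []
        for path in glob.glob(pattern):
            with open(path) as f:
                m = TOTAL_RE.search(f.read())
                if m:
                    times.append(float(m.group(1)))
        return times

    base = total_cpu_seconds("baseline/*.log")
    new = total_cpu_seconds("patched/*.log")

    # Welch's t-test compares the two sample means without assuming
    # the samples share a variance.
    t_stat, p_two_sided = stats.ttest_ind(base, new, equal_var=False)

    # Relative speedup: positive when the patched compiler is faster.
    speedup = (statistics.mean(base) - statistics.mean(new)) / statistics.mean(base)

    # One-sided confidence that the patched compiler is faster.
    confidence = 1 - p_two_sided / 2 if speedup > 0 else p_two_sided / 2

    print(f"speedup: {speedup:.3%}")
    print(f"confidence faster: {confidence:.1%}")

Note that with ten runs per sample, even a fraction-of-a-percent
difference can reach high confidence if the runs are consistent, which
is exactly why the reported confidence already subsumes concerns about
sample size and system noise.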