On Mon, 2013-09-02 at 13:44 +0200, Michael Matz wrote:
> Hi,
>
> On Fri, 30 Aug 2013, David Malcolm wrote:
>
> > Here's the result of a pair of builds of r202029 without and with the
> > patches, configured with --enable-checking=release, running "make", then
> > stripping debuginfo [1]
> >
> > So the overall sizes of such binaries are essentially unchanged.
>
> Yep, cool.
>
> > Any suggestions on what to compile to compare performance?  By 60
> > seconds, do you mean 60s for one TU, or do you mean a large build e.g.
> > the linux kernel?
>
> For one TU, so as to not measure any significant execve, as, or IO
> overhead.  Some people compile e.g. preprocessed variants of some GCC
> source file; myself, I've been measuring such speeds for some years with
> unchanged versions of the attached (big-code.c, several large functions
> with arithmetic and one-deep loops), which can be customized by hand to
> produce different largeness, and kdecore.cc [1] (real-world C++ library
> code from 2009).
>
> You can customize big-code.c to match some compile-time goal by
> commenting out some FUNC invocations for some types, or by changing the
> body size of the functions by fiddling in the FUNC macro itself to invoke
> more of the L2's or even L3's, but that's really slow.
Thanks.  Everyone seems to have their own benchmarking test files - I'd like to gather the ones I can into a common repository.  What license is big-code.c under?  Presumably kdecore.cc comes from a compile of KDE and is thus covered by KDE's license (plus the licenses of all the headers that were included, I guess).

I scripted the compilation of each of big-code.c and kdecore.cc at -O3, 10 times each, using the cc1plus from each of the builds described above (i.e. builds of r202029 without the patches (control) and with them (experiment), both configured with --enable-checking=release, running "make", then stripping debuginfo).

I wrote a script to parse the "TOTAL" timevar data from the logs, extracting the total user, sys, and wallclock times, and the total ggc memory.  I used some code from Python's benchmarking suite to determine whether the observed differences are "significant", and to draw comparative graphs of the data (actually, to generate URLs to Google's Chart API).

There were no significant differences in these data between the cc1plus with the patches and the one without.  The script uses Student's two-tailed t-test on the benchmark results at the 95% confidence level to determine significance, but it also applies a cutoff: when the averages are within 1% of each other, the difference is treated as "insignificant" regardless of the t value.  If I remove that cutoff, the kdecore.cc wallclock time is reported as *faster* with the patch (t=2.41) - but by a tiny amount.
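For reference, the significance check described above can be sketched roughly as follows.  This is a minimal reconstruction, not the actual script: the names tscore and significant are mine, and 2.101 is the standard two-tailed t-table critical value for 2*10 - 2 = 18 degrees of freedom at the 95% level; the code in Python's benchmarking suite differs in detail.

```python
import math

def tscore(sample1, sample2):
    """Student's t statistic for two equal-sized samples."""
    assert len(sample1) == len(sample2)
    n = len(sample1)
    mean1 = sum(sample1) / n
    mean2 = sum(sample2) / n
    # Unbiased sample variances, then the pooled variance.
    var1 = sum((x - mean1) ** 2 for x in sample1) / (n - 1)
    var2 = sum((x - mean2) ** 2 for x in sample2) / (n - 1)
    pooled_var = (var1 + var2) / 2
    return (mean1 - mean2) / math.sqrt(pooled_var * 2 / n)

# Two-tailed critical value for 18 degrees of freedom at the 95%
# confidence level (standard t-table entry for n1 = n2 = 10).
T_CRITICAL_18 = 2.101

def significant(control, experiment, cutoff=0.01):
    """Significant per the t-test, unless the averages are within
    `cutoff` (1%) of each other."""
    avg_c = sum(control) / len(control)
    avg_e = sum(experiment) / len(experiment)
    if abs(avg_e - avg_c) / avg_c < cutoff:
        return False  # the 1% cutoff: treat as insignificant
    return abs(tscore(control, experiment)) > T_CRITICAL_18

# The kdecore.cc wallclock samples from the data below:
control = [69.01, 69.01, 68.94, 69.03, 69.13, 68.99, 69.18, 69.07,
           69.01, 69.01]
experiment = [68.88, 68.97, 68.99, 69.03, 69.12, 68.88, 68.96, 68.86,
              68.85, 68.99]

print(round(tscore(control, experiment), 2))  # 2.41: raw t-test says significant...
print(significant(control, experiment))       # False: ...but the 1% cutoff suppresses it
```

On the kdecore.cc wall data this reproduces the t=2.41 mentioned above, which exceeds 2.101 and so would count as significant on its own, but the means differ by only ~0.12%, well inside the 1% cutoff.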
FWIW this benchmarking code can be seen at:
https://github.com/davidmalcolm/gcc-build/commit/4581f080be2ed92179bfc1bc12e0ba9e923adaae

Data and URLs to graphs follow:

kdecore.cc -O3: usr
control:    [63.01, 63.11, 62.9, 63.14, 63.22, 63.0, 63.21, 63.03, 63.07, 63.09]
experiment: [62.94, 63.05, 62.99, 63.03, 63.25, 62.91, 63.14, 62.96, 63.0, 63.04]
Min: 62.900000 -> 62.910000: 1.00x slower
Avg: 63.078000 -> 63.031000: 1.00x faster
Not significant
Stddev: 0.09852 -> 0.10049: 1.0200x larger
Timeline: http://goo.gl/8P9mFr

kdecore.cc -O3: sys
control:    [5.85, 5.74, 5.91, 5.79, 5.79, 5.85, 5.81, 5.9, 5.83, 5.8]
experiment: [5.8, 5.81, 5.88, 5.88, 5.77, 5.87, 5.69, 5.8, 5.75, 5.84]
Min: 5.740000 -> 5.690000: 1.01x faster
Avg: 5.827000 -> 5.809000: 1.00x faster
Not significant
Stddev: 0.05229 -> 0.06154: 1.1769x larger
Timeline: http://goo.gl/xfWoUL

kdecore.cc -O3: wall
control:    [69.01, 69.01, 68.94, 69.03, 69.13, 68.99, 69.18, 69.07, 69.01, 69.01]
experiment: [68.88, 68.97, 68.99, 69.03, 69.12, 68.88, 68.96, 68.86, 68.85, 68.99]
Min: 68.940000 -> 68.850000: 1.00x faster
Avg: 69.038000 -> 68.953000: 1.00x faster
Not significant
Stddev: 0.07052 -> 0.08616: 1.2217x larger
Timeline: http://goo.gl/Pf9BL8

kdecore.cc -O3: ggc
control:    [988589.0, 988591.0, 988585.0, 988582.0, 988584.0, 988589.0, 988585.0, 988582.0, 988584.0, 988586.0]
experiment: [988593.0, 988585.0, 988589.0, 988585.0, 988591.0, 988588.0, 988591.0, 988585.0, 988589.0, 988590.0]
Mem max: 988591.000 -> 988593.000: 1.0000x larger
Usage over time: http://goo.gl/4pxB8Y

big-code.c -O3: usr
control:    [60.06, 60.31, 60.33, 60.29, 60.28, 60.26, 60.41, 60.28, 60.29, 60.29]
experiment: [59.9, 60.14, 60.27, 60.33, 60.22, 60.33, 60.31, 60.33, 60.16, 60.26]
Min: 60.060000 -> 59.900000: 1.00x faster
Avg: 60.280000 -> 60.225000: 1.00x faster
Not significant
Stddev: 0.08781 -> 0.13360: 1.5215x larger
Timeline: http://goo.gl/gkFbgi

big-code.c -O3: sys
control:    [1.32, 1.37, 1.34, 1.41, 1.35, 1.34, 1.29, 1.36, 1.33, 1.35]
experiment: [1.38, 1.35, 1.32, 1.32, 1.33, 1.33, 1.34, 1.32, 1.41, 1.35]
Min: 1.290000 -> 1.320000: 1.02x slower
Avg: 1.346000 -> 1.345000: 1.00x faster
Not significant
Stddev: 0.03169 -> 0.02953: 1.0731x smaller
Timeline: http://goo.gl/DPijei

big-code.c -O3: wall
control:    [61.48, 61.78, 61.78, 61.8, 61.74, 61.71, 61.81, 61.73, 61.73, 61.74]
experiment: [61.39, 61.58, 61.7, 61.75, 61.65, 61.76, 61.76, 61.76, 61.67, 61.72]
Min: 61.480000 -> 61.390000: 1.00x faster
Avg: 61.730000 -> 61.674000: 1.00x faster
Not significant
Stddev: 0.09393 -> 0.11587: 1.2337x larger
Timeline: http://goo.gl/0WcMNl

big-code.c -O3: ggc
control:    [575294.0, 575294.0, 575294.0, 575294.0, 575294.0, 575294.0, 575294.0, 575294.0, 575294.0, 575294.0]
experiment: [575294.0, 575294.0, 575294.0, 575294.0, 575294.0, 575294.0, 575294.0, 575294.0, 575294.0, 575294.0]
Mem max: 575294.000 -> 575294.000: no change
Usage over time: http://goo.gl/jO0Enh