Hi, I thought this might interest you. I have done some benchmarking to compare different gcc releases. For this I have written a bash-based benchmark suite which compiles and runs mostly C benchmarks.
Benchmark environment:

The benchmarks run on an i686-pc-linux-gnu (Gentoo-based) system with an Athlon XP 2600+ (Barton core, 512 KB L2 cache) at 1.9 GHz and 512 MB RAM. All benchmarks run with nice -n -20, and to further minimize noise only very few processes are running while benchmarking (only the kernel, agetty, bash, udev, login and syslog).

The benchmarks:

I'm using freebench-1.03, nbench-byte-2.2.2 and a self-made lamebench-3.96.1 based on lame-3.96.1. In the lame benchmark a WAV file is encoded to an MP3 file and the time taken for this is measured.

The benchmark procedure:

A bash script named startbenchrotation starts the given benchmarks with the following command lines:

    startfreebench-1.03() {
        make distclean >/dev/null 2>&1
        nice -n -20 make ref >/dev/null 2>&1
    }

    startnbench-byte-2.2.2() {
        make mrproper >/dev/null 2>&1
        make >/dev/null 2>&1
        nice -n -20 ./nbench 2>/dev/null > ./resultfile.txt
    }

    startlamebench-3.96.1() {
        rm -rf lamebuild
        rm -f testfile.mp3
        mkdir lamebuild || error "Couldn't mkdir lamebench"
        cd lamebuild
        ../lame-3.96.1/configure >/dev/null 2>&1
        make >/dev/null 2>&1
        START=`date +%s`
        nice -n -20 frontend/lame -m s --quiet -q 0 -b 128 --cbr ../testfile.wav ../testfile.mp3
        END=`date +%s`
        cd ..
        echo "$((${END}-${START}))" > ./resultfile.txt
    }

Each benchmark is run with a combination of cflags. The cflags are composed of base flags and testing flags, e.g.:

    BASEFLAGS="-s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe"
    TESTINGFLAGS="-fforce-addr|-fsched-spec-load|-fmove-all-movables|-freduce-all-givs|-ffast-math|-ftracer|-funroll-loops|-funroll-all-loops|-fprefetch-loop-arrays|-mfpmath=sse|-mfpmath=sse,387|-momit-leaf-frame-pointer"

'|' is used as a field separator. First, all flags from BASEFLAGS and TESTINGFLAGS are combined, the benchmark is started and its result is taken as the best result so far. Then one flag from TESTINGFLAGS is removed and the benchmark is repeated. If the arithmetic average of the results of the repeated benchmark is better than the arithmetic average of the best result so far, the removed cflag is noted as a candidate for the worst flag. This is done for every flag in TESTINGFLAGS, and the worst flag of all is filtered out of TESTINGFLAGS. After that, the whole procedure is started again with the new TESTINGFLAGS without the filtered flag. (This heuristic approach to compiler performance comparison was described on the gcc mailing list some months ago, "Compiler Optimization Orchestration For Peak Performance" by Zhelong Pan and Rudolf Eigenmann.) A stripped-down sketch of this elimination loop is appended below, after the conclusion.

The protagonists:

The tested compilers were gcc-3.3.6 (Gentoo system compiler), gcc-3.4.4 (Gentoo system compiler) and gcc-4.0.2 (FSF release).

The results:

All results are given as relative performance measures, with gcc-3.3.6 taken as the baseline. If x% > 0%, the given compiler generated code which was x% faster than the code from gcc-3.3.6; if x% <= 0%, the generated code was slower (or no faster). All relations are based on the arithmetic average over all passes of a benchmark for the best achieved result. (A small sketch of this calculation is also appended below.)

    benchmark    gcc-3.3.6    gcc-3.4.4    gcc-4.0.2
    freebench    -            +1%          -5%
    nbench       -            +13%         +11%
    lamebench    -            +1%          +1%

Conclusion:

Well, this benchmark suite is only meant for rough comparisons between different compilers to estimate performance in real-life applications. If you are interested in future benchmarks of newer releases, I could offer this service. If you think this is uninteresting, I won't send any benchmark measures anymore.
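To make the procedure clearer, here is a minimal sketch of the elimination loop, not the actual suite: run_benchmark is a placeholder that should compile and run one benchmark with the given cflags and print the arithmetic average of all its passes, and higher averages are assumed to be better (as for the nbench indexes; for a time-based benchmark like lamebench the comparison would be inverted).

    # Sketch of the flag-elimination heuristic; run_benchmark is a placeholder.
    run_benchmark() {
        echo 0    # should print the averaged result for cflags "$1"
    }

    BASEFLAGS="-s -static -O3 -march=athlon-xp -fomit-frame-pointer -pipe"
    TESTINGFLAGS="-fforce-addr|-ffast-math|-ftracer|-funroll-loops"

    # Best result so far: all base flags plus all testing flags combined.
    best=$(run_benchmark "${BASEFLAGS} ${TESTINGFLAGS//|/ }")

    while [ -n "${TESTINGFLAGS}" ]; do
        worst_flag=""
        worst_result=${best}
        IFS='|' read -ra flags <<< "${TESTINGFLAGS}"
        for flag in "${flags[@]}"; do
            # Drop one testing flag and repeat the benchmark.
            remaining=$(echo "${TESTINGFLAGS}" | tr '|' '\n' | grep -Fvx -- "${flag}" | paste -sd' ' -)
            result=$(run_benchmark "${BASEFLAGS} ${remaining}")
            # If removing the flag beats the best average so far, it is a
            # candidate for the worst flag of this round.
            if [ "$(echo "${result} > ${worst_result}" | bc)" -eq 1 ]; then
                worst_result=${result}
                worst_flag=${flag}
            fi
        done
        # No single removal improved the result any more: stop.
        [ -z "${worst_flag}" ] && break
        # Filter the worst flag out and start over with the reduced set.
        TESTINGFLAGS=$(echo "${TESTINGFLAGS}" | tr '|' '\n' | grep -Fvx -- "${worst_flag}" | paste -sd'|' -)
        best=${worst_result}
    done

    echo "kept flags: ${BASEFLAGS} ${TESTINGFLAGS//|/ }"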
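And, just so the table is unambiguous, a tiny sketch of how the percentages can be read; avg_336 and avg_test stand for the averaged best results of gcc-3.3.6 and the tested compiler (for a score-based benchmark like nbench higher is better; for a time-based one like lamebench the ratio would be inverted):

    # Relative performance of a tested compiler vs. gcc-3.3.6 (sketch only).
    relative_percent() {
        avg_336=$1
        avg_test=$2
        echo "scale=0; (${avg_test} - ${avg_336}) * 100 / ${avg_336}" | bc
    }

    relative_percent 100 113    # prints 13, i.e. +13% as in the nbench column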
I hope this will help track the performance of the code generated by gcc and help make gcc better in this effort. Constructive criticism is always welcome. I hope you guys keep up your work on improving gcc.

Thanks for reading,
Ronny Peine