Hi, I am doing research on optimization of microprocessors and
compilers. Some of you already know my optimization manuals
(www.agner.org/optimize/).
I have tested many different compilers and compared how well they
optimize C++ code. I have been pleased to observe that gcc has been
improved a lot in the last couple of years. The gcc compiler itself now
matches the optimizing performance of the Intel compiler, and it beats
all other compilers I have tested. All you hard-working developers
deserve credit for this!
I can imagine that gcc might be the compiler of choice for all x86 and
x86-64 platforms in the future. Actually, the compiler itself is very
close to being the best, but it appears that the function libraries are
lagging behind. I have tested a few of the most important functions in
libc and compared them with other available libraries (MS, Borland,
Intel, Mac). The comparison does not look good for gnu libc. See my test
results in http://www.agner.org/optimize/optimizing_cpp.pdf section 2.6.
The 64-bit version is better than the 32-bit version, though.
The first thing that you can do to improve the performance is to drop
the builtin versions of memory and string functions. The speed can be
improved by up to a factor of 5 in some cases by compiling with
-fno-builtin. The builtin version is never optimal, except for memcpy in
cases where the count is a small compile-time constant so that it can be
replaced by simple mov instructions.
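
To illustrate the distinction between a constant and a run-time count,
here is a minimal sketch in C. The record_t type and the two function
names are hypothetical and are not taken from libc or gcc; what the
compiler actually emits depends on the gcc version, target and flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte record, used only for illustration. */
typedef struct {
    uint32_t id;
    uint32_t flags;
} record_t;

/* The count is a small compile-time constant (sizeof *dst == 8).
   With builtins enabled, an optimizing compiler can typically replace
   this memcpy with one or two mov instructions. */
void copy_record(record_t *dst, const record_t *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* The count is known only at run time. This is the case where the
   choice between the builtin expansion and a call to the library
   memcpy matters; -fno-builtin forces the library call. */
void copy_bytes(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n);
}
```

Compiling this file with "gcc -O2 -S" and again with
"gcc -O2 -fno-builtin -S" and comparing the assembly output shows which
of the two calls is expanded inline. Note that -fno-builtin also turns
the constant-count copy into a library call.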
Next, the function libraries should have CPU-dispatching and use the
latest instruction sets where appropriate. You are not even using XMM
registers for memcpy in 64-bit libc.
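
A minimal sketch of what CPU-dispatching can look like in C follows. It
is not how libc or any existing library does it. All names (copy_fn,
copy_generic, copy_sse2, select_copy, my_copy) are hypothetical, and
copy_sse2 is only a placeholder for a routine that would really use XMM
registers. The feature test assumes the __builtin_cpu_init and
__builtin_cpu_supports extensions found in newer gcc releases (4.8 and
later) on x86; on other compilers the sketch falls back to the generic
version.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Baseline version: a plain byte loop that runs on any CPU. */
static void *copy_generic(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Placeholder for an SSE2-tuned version. A real one would use XMM
   loads and stores; this one defers to memcpy to keep the sketch short. */
static void *copy_sse2(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* Pick an implementation based on what the running CPU supports. */
static copy_fn select_copy(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* Assumption: gcc 4.8 or later, which provides these builtins. */
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse2"))
        return copy_sse2;
#endif
    return copy_generic;
}

static void *copy_resolve(void *dst, const void *src, size_t n);

/* Starts out pointing at the resolver, so detection runs only once. */
static copy_fn copy_impl = copy_resolve;

/* First call: detect the CPU, store the chosen version, then forward. */
static void *copy_resolve(void *dst, const void *src, size_t n)
{
    copy_impl = select_copy();
    return copy_impl(dst, src, n);
}

/* Public entry point: one indirect call after the first use. */
void *my_copy(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    return copy_impl(dst, src, n);
}
```

The function pointer is resolved on the first call, so later calls cost
only one indirect jump. A library that must cover every Intel and AMD
processor would have more branches in select_copy, one per instruction
set it wants to exploit.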
I think you can borrow code from the Mac/Darwin/Xnu project. They have
optimized these functions very carefully for the Intel Core and Core 2
processors. Of course they have the advantage that they don't need to
support any other processors, whereas gcc has to support every possible
Intel and AMD processor. This means more CPU-dispatching.
I have made a few optimized functions myself and published them as a
multi-platform library (www.agner.org/optimize/asmlib.zip). It is faster
than most other libraries on an Intel Core2 and up to ten times faster
than gcc using builtin functions. My library is published under the GPL
license, but I will allow you to use my code in gnu libc if you wish.
(Sorry, I don't have the time to work on the gnu project myself, but you
may contact me for details about the code.)
The Windows version of gcc is not up to date, but I think that when gcc
gets a reputation as the best compiler, more people will be motivated to
update cygwin/mingw. A lot of people are actually using it.