Re: gcc will become the best optimizing x86 compiler -> PPC case

2008-08-06 Thread Etienne Lorrain
---Andrew Pinski <[EMAIL PROTECTED]> wrote: > <[EMAIL PROTECTED]> wrote: > > The PPC has a very fast dcbz (data cache block zero) to clear memory, > > and also dcbi (data cache block invalidate) which permit to have a > > cached line caching an address without reading first the memory (when > > yo

Re: gcc will become the best optimizing x86 compiler -> PPC case

2008-08-02 Thread Andrew Pinski
On Fri, Aug 1, 2008 at 9:24 AM, Etienne Lorrain <[EMAIL PROTECTED]> wrote: > The PPC has a very fast dcbz (data cache block zero) to clear memory, > and also dcbi (data cache block invalidate) which permit to have a > cached line caching an address without reading first the memory (when > you plan

Re: gcc will become the best optimizing x86 compiler -> PPC case

2008-08-01 Thread Etienne Lorrain
> You forgot to look at PowerPC : > > http://cvs.opensolaris.org/source/xref/ppc-dev/ppc-dev/usr/src/lib/libc/ppc/gen/memcpy.s > > is that nice and small ? I had to clear/check the whole 256 Mbytes SDRAM on a PPC system, and the fastest way I got (excluding DMA access) is by playing with the laye

Re: gcc will become the best optimizing x86 compiler

2008-07-31 Thread Denys Vlasenko
On Thursday 31 July 2008 11:36, Dave Korn wrote: > Agner Fog wrote on 31 July 2008 07:14: > > > Denys Vlasenko wrote: > >> I tend to doubt that odd-byte aligned large memcpys are anywhere > >> near typical. malloc and mmap both return well-aligned buffers > >> (say, 8 byte aligned). Static and on-

RE: gcc will become the best optimizing x86 compiler

2008-07-31 Thread Dave Korn
Agner Fog wrote on 31 July 2008 07:14: > Denys Vlasenko wrote: >> I tend to doubt that odd-byte aligned large memcpys are anywhere >> near typical. malloc and mmap both return well-aligned buffers >> (say, 8 byte aligned). Static and on-stack objects are also >> at least word-aligned 99% of the ti

Re: gcc will become the best optimizing x86 compiler

2008-07-30 Thread Agner Fog
Denys Vlasenko wrote: > I tend to doubt that odd-byte aligned large memcpys are anywhere near typical. malloc and mmap both return well-aligned buffers (say, 8 byte aligned). Static and on-stack objects are also at least word-aligned 99% of the time. memcpy can just use "relatively simple" code fo
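The alignment claim quoted above is easy to check on a given system. The following is a minimal sketch, not code from the thread: `is_aligned` and `malloc_is_8_aligned` are hypothetical helper names, and the 8-byte figure is simply the example alignment mentioned in the message.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: nonzero if p is a multiple of `alignment`,
   which is assumed to be a power of two. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Hypothetical helper: allocate n bytes and report whether the
   returned buffer is 8-byte aligned (the example from the message). */
static int malloc_is_8_aligned(size_t n)
{
    void *p = malloc(n);
    int ok = (p != NULL) && is_aligned(p, 8);
    free(p);
    return ok;
}
```

Calling `malloc_is_8_aligned(1000)` on the target system shows whether the allocator behaves as described; the result depends on the C library in use.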

Re: gcc will become the best optimizing x86 compiler

2008-07-30 Thread Christopher Faylor
On Tue, Jul 29, 2008 at 04:14:49PM +1000, Ben Elliston wrote: >> Since there is no libc mailing list, I thought that the gcc list is the >> place to contact the maintainers of libc. Am I on the wrong list? Or are >> there no maintainers of libc? > >See: > http://sources.redhat.com/glibc/ > >You

Re: gcc will become the best optimizing x86 compiler

2008-07-30 Thread Denys Vlasenko
On Wednesday 30 July 2008 19:14, Agner Fog wrote: > I agree that the OpenSolaris memcpy is bigger than necessary. However, > it is necessary to have 16 branches for covering all possible alignments > modulo 16. This is because, unfortunately, there is no XMM shift > instruction with a variable c
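To make the "alignments modulo 16" point concrete, here is a small sketch of how one of 16 cases could be selected. It assumes (this is not stated in the thread) that the copy first advances the destination to a 16-byte boundary and then dispatches on where the source lands; `offset_mod16` and `copy_case` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: offset of p within its 16-byte block (0..15). */
static unsigned offset_mod16(const void *p)
{
    return (unsigned)((uintptr_t)p & 15u);
}

/* Hypothetical helper: index (0..15) of the case a copy would take,
   assuming the destination is first advanced to a 16-byte boundary. */
static unsigned copy_case(const void *dst, const void *src)
{
    /* Bytes that must be copied singly before dst is 16-byte aligned. */
    unsigned head = (16u - offset_mod16(dst)) & 15u;
    /* Source misalignment that remains once dst is aligned. */
    return (offset_mod16(src) + head) & 15u;
}
```

Case 0 means both pointers can use aligned 16-byte accesses; each of the other 15 values would need its own shift amount to realign the source data.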

Re: gcc will become the best optimizing x86 compiler

2008-07-30 Thread Dennis Clarke
On Wed, Jul 30, 2008 at 5:14 PM, Agner Fog <[EMAIL PROTECTED]> wrote: > Denys Vlasenko wrote: >>> >>> 3164 line source file which implements memcpy(). >>> You got to be kidding. >>> How much of L1 icache it blows away in the process? >>> I bet it performs wonderfully on microbenchmarks though. >>>

Re: gcc will become the best optimizing x86 compiler

2008-07-30 Thread Agner Fog
Denys Vlasenko wrote: > 3164 line source file which implements memcpy(). > You got to be kidding. > How much of L1 icache it blows away in the process? > I bet it performs wonderfully on microbenchmarks though. I agree that the OpenSolaris memcpy is bigger than necessary. However, it is necessary t

Re: gcc will become the best optimizing x86 compiler

2008-07-30 Thread Denys Vlasenko
On Wed, Jul 30, 2008 at 5:57 PM, Denys Vlasenko <[EMAIL PROTECTED]> wrote: > On Fri, Jul 25, 2008 at 9:08 AM, Agner Fog <[EMAIL PROTECTED]> wrote: >> Raksit Ashok wrote: >>>There is a more optimized version for 64-bit: >>>http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/amd64/

Re: gcc will become the best optimizing x86 compiler

2008-07-30 Thread Denys Vlasenko
On Fri, Jul 25, 2008 at 9:08 AM, Agner Fog <[EMAIL PROTECTED]> wrote: > Raksit Ashok wrote: >>There is a more optimized version for 64-bit: >>http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/amd64/gen/memcpy.s >>I think this looks similar to your implementation, Agner. > > Yes

Re: gcc will become the best optimizing x86 compiler

2008-07-30 Thread Dennis Clarke
On Wed, Jul 30, 2008 at 3:23 PM, Eus <[EMAIL PROTECTED]> wrote: > Hi Ho! > > --- On Tue, 7/29/08, "Dennis Clarke" <[EMAIL PROTECTED]> wrote: > >> hold on .. on the NEWS page I see ... okay .. how very user friendly. >> Sort of the thing one would put on the project homepage I would think. > > Do yo

Re: gcc will become the best optimizing x86 compiler

2008-07-30 Thread Eus
Hi Ho! --- On Tue, 7/29/08, "Dennis Clarke" <[EMAIL PROTECTED]> wrote: > hold on .. on the NEWS page I see ... okay .. how very user friendly. > Sort of the thing one would put on the project homepage I would think. Do you mind to tell me what you saw? I was looking for the interesting part on t

Re: gcc will become the best optimizing x86 compiler

2008-07-29 Thread Tim Prince
Agner Fog wrote: > Michael Matz wrote: >> You must be doing something wrong. If the compiler decides to inline the >> string ops it either knows the size or you told it to do it anyway >> (-minline-all-stringops or -minline-stringops-dynamically). In both cases will it use wider than byte moves when po

Re: gcc will become the best optimizing x86 compiler

2008-07-29 Thread Michael Matz
Hi, On Tue, 29 Jul 2008, Agner Fog wrote: > g++ (v. 4.2.3) without any options converts memcpy with unknown size to rep > movsb Use newer GCCs. They will (1) not expand memcpy inline for unknown sizes (without special options, also make sure you don't get the glibc inlines) and (2) won't exp

Re: gcc will become the best optimizing x86 compiler

2008-07-29 Thread Joseph S. Myers
On Tue, 29 Jul 2008, Steven Bosscher wrote: > On Tue, Jul 29, 2008 at 11:26 AM, Richard Guenther > <[EMAIL PROTECTED]> wrote: > >> g++ (v. 4.2.3) without any options converts memcpy with unknown size to > >> rep > >> movsb > > > > Make sure to use -D__NO_STRING_INLINES to not get glibc's inline >

Re: gcc will become the best optimizing x86 compiler

2008-07-29 Thread Steven Bosscher
On Tue, Jul 29, 2008 at 11:26 AM, Richard Guenther <[EMAIL PROTECTED]> wrote: >> g++ (v. 4.2.3) without any options converts memcpy with unknown size to rep >> movsb > > Make sure to use -D__NO_STRING_INLINES to not get glibc's inline > implementation. Why is this not the default? Gr. Steven
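A minimal C sketch for trying this out, assuming a GCC toolchain. `copy_unknown_size` is a hypothetical wrapper (not from the thread) whose length is only known at run time, which is the "unknown size" case being discussed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: n is only known at run time, so the compiler
   cannot choose a fixed-size expansion for this memcpy call. */
void copy_unknown_size(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
}
```

Saving this as, say, `copy.c` and comparing the output of `gcc -O2 -S copy.c` with `gcc -O2 -D__NO_STRING_INLINES -S copy.c`, or with the `-minline-all-stringops` option named elsewhere in this thread, should show whether a call to memcpy or an inline expansion is emitted. The result will depend on the GCC and glibc versions installed.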

Re: gcc will become the best optimizing x86 compiler

2008-07-29 Thread Richard Guenther
On Tue, Jul 29, 2008 at 7:26 AM, Agner Fog <[EMAIL PROTECTED]> wrote: > Michael Matz wrote: >> >> You must be doing something wrong. If the compiler decides to inline the >> string ops it either knows the size or you told it to do it anyway >> (-minline-all-stringops or -minline-stringops-dynamica

Re: gcc will become the best optimizing x86 compiler

2008-07-28 Thread Ben Elliston
> Since there is no libc mailing list, I thought that the gcc list is the > place to contact the maintainers of libc. Am I on the wrong list? Or are > there no maintainers of libc? See: http://sources.redhat.com/glibc/ You want the libc-alpha list, I think. Cheers, Ben

Re: gcc will become the best optimizing x86 compiler

2008-07-28 Thread Agner Fog
Gerald Pfeifer wrote: > See how user friendly we in GCC-land are in comparison? ;-) Since there is no libc mailing list, I thought that the gcc list is the place to contact the maintainers of libc. Am I on the wrong list? Or are there no maintainers of libc?

Re: gcc will become the best optimizing x86 compiler

2008-07-28 Thread Agner Fog
Michael Matz wrote: > You must be doing something wrong. If the compiler decides to inline the > string ops it either knows the size or you told it to do it anyway > (-minline-all-stringops or -minline-stringops-dynamically). In both cases will it use wider than byte moves when possible. g++ (v.

Re: gcc will become the best optimizing x86 compiler

2008-07-28 Thread Gerald Pfeifer
On Mon, 28 Jul 2008, Dennis Clarke wrote: > hold on .. on the NEWS page I see ... okay .. how very user friendly. > Sort of the thing one would put on the project homepage I would think. See how user friendly we in GCC-land are in comparison? ;-) Gerald

Re: gcc will become the best optimizing x86 compiler

2008-07-28 Thread Dennis Clarke
On Mon, Jul 28, 2008 at 2:30 PM, Dave Korn <[EMAIL PROTECTED]> wrote: > Dennis Clarke wrote on 28 July 2008 18:54: > >> On Mon, Jul 28, 2008 at 1:17 PM, Paolo Carlini <[EMAIL PROTECTED]> >> wrote: >>> Dennis Clarke wrote: >>>> also, IMO, the NEWS section says nothing useful to any human.

RE: gcc will become the best optimizing x86 compiler

2008-07-28 Thread Dave Korn
Dennis Clarke wrote on 28 July 2008 18:54: > On Mon, Jul 28, 2008 at 1:17 PM, Paolo Carlini <[EMAIL PROTECTED]> > wrote: >> Dennis Clarke wrote: >>> >>> also, IMO, the NEWS section says nothing useful to any human. >>> >> >> but, *some* humans like to click on the first (download) link on top

Re: gcc will become the best optimizing x86 compiler

2008-07-28 Thread Ian Lance Taylor
"Dennis Clarke" <[EMAIL PROTECTED]> writes: > hold on .. on the NEWS page I see ... okay .. how very user friendly. > Sort of the thing one would put on the project homepage I would think. The glibc project has their own special approach to user friendliness. Ian

Re: gcc will become the best optimizing x86 compiler

2008-07-28 Thread Dennis Clarke
On Mon, Jul 28, 2008 at 1:17 PM, Paolo Carlini <[EMAIL PROTECTED]> wrote: > Dennis Clarke wrote: >> >> also, IMO, the NEWS section says nothing useful to any human. >> > > but, *some* humans like to click on the first (download) link on top. where? It says Availability The releases are availab

Re: gcc will become the best optimizing x86 compiler

2008-07-28 Thread Paolo Carlini
Dennis Clarke wrote: > also, IMO, the NEWS section says nothing useful to any human. but, *some* humans like to click on the first (download) link on top. Paolo.

Re: gcc will become the best optimizing x86 compiler

2008-07-28 Thread Dennis Clarke
On Mon, Jul 28, 2008 at 8:10 AM, Daniel Jacobowitz <[EMAIL PROTECTED]> wrote: > On Mon, Jul 28, 2008 at 12:56:57PM +0200, Agner Fog wrote: >> >2008/7/26 Agner Fog <[EMAIL PROTECTED]>: >> >>I have libc version 2.7. Can't find version 2.8 >> >It's in Fedora 9, I have no idea why the source isn't dire

Re: gcc will become the best optimizing x86 compiler

2008-07-28 Thread Michael Matz
Hi, On Mon, 28 Jul 2008, Agner Fog wrote: > Glibc 2.8 is still almost 5 times slower than the best function > libraries for unaligned data on Intel Core 2, and the default builtin > function is slower than any other implementation I have seen (copies 1 > byte at a time!). You must be doing so

Re: gcc will become the best optimizing x86 compiler

2008-07-28 Thread Daniel Jacobowitz
On Mon, Jul 28, 2008 at 12:56:57PM +0200, Agner Fog wrote: > >2008/7/26 Agner Fog <[EMAIL PROTECTED]>: > >>I have libc version 2.7. Can't find version 2.8 > >It's in Fedora 9, I have no idea why the source isn't directly > >available from the glibc homepage. > > 2.8 is not an official final release

Re: gcc will become the best optimizing x86 compiler

2008-07-28 Thread Agner Fog
Michael Meissner wrote: >Memcpy/memset optimizations were added to glibc 2.8, though when your favorite >distribution will provide it is a different question: >http://sourceware.org/ml/libc-alpha/2008-04/msg00050.html I finally got a SUSE with glibc 2.8. I can see that 32-bit memcpy has been m

Re: gcc will become the best optimizing x86 compiler

2008-07-28 Thread Andrew Haley
Agner Fog wrote: > Basile STARYNKEVITCH wrote: >>At last, at the recent (july 2008) GCC summit, someone (sorry I forgot > who, probably someone from SuSE) >> proposed in a BOFS to have architecture and machine specific > hand-tuned (or even hand-written assembly) low >> level libraries for such bas

Re: gcc will become the best optimizing x86 compiler

2008-07-26 Thread Agner Fog
Michael Meissner wrote: > On Fri, Jul 25, 2008 at 09:08:42AM +0200, Agner Fog wrote: >> Gnu libc could borrow a lot of optimized functions from Opensolaris and Mac and other open source projects. They look better than Gnu libc, but there is still room for improvement. For example, Opensolaris d

Re: gcc will become the best optimizing x86 compiler

2008-07-25 Thread Michael Meissner
On Fri, Jul 25, 2008 at 09:08:42AM +0200, Agner Fog wrote: > Raksit Ashok wrote: > >There is a more optimized version for 64-bit: > >http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/amd64/gen/memcpy.s > >I think this looks similar to your implementation, Agner. > > Yes it is s

Re: gcc will become the best optimizing x86 compiler

2008-07-25 Thread Agner Fog
Raksit Ashok wrote: >There is a more optimized version for 64-bit: >http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/amd64/gen/memcpy.s >I think this looks similar to your implementation, Agner. Yes it is similar to my code. Gnu libc could borrow a lot of optimized function

Re: gcc will become the best optimizing x86 compiler

2008-07-24 Thread Raksit Ashok
On Thu, Jul 24, 2008 at 1:03 AM, Agner Fog <[EMAIL PROTECTED]> wrote: > Dennis Clarke wrote: >>The Sun Studio 12 compiler with Solaris 10 on AMD Opteron or >>UltraSparc beats GCC in almost every single test case that I have >>seen. > > This is memcpy on Solaris: > http://src.opensolaris.org/source/

Re: gcc will become the best optimizing x86 compiler

2008-07-24 Thread Basile STARYNKEVITCH
Joseph S. Myers wrote: > On Thu, 24 Jul 2008, Basile STARYNKEVITCH wrote: >> At last, at the recent (july 2008) GCC summit, someone (sorry I forgot who, probably someone from SuSE) proposed in a BOFS to have architecture and machine specific hand-tuned (or even hand-written assembly) low level libra

Re: gcc will become the best optimizing x86 compiler

2008-07-24 Thread Agner Fog
Basile STARYNKEVITCH wrote: >At last, at the recent (july 2008) GCC summit, someone (sorry I forgot who, probably someone from SuSE) > proposed in a BOFS to have architecture and machine specific hand-tuned (or even hand-written assembly) low > level libraries for such basic things as memset et

Re: gcc will become the best optimizing x86 compiler

2008-07-24 Thread Agner Fog
Joseph S. Myers wrote: >I don't know if it was proposed in this context, but the ARM EABI has >various __aeabi_mem* functions for calls known to have particular >alignment and the idea is relevant to other platforms if you provide such >functions with the compiler. The compiler could also generat

Re: gcc will become the best optimizing x86 compiler

2008-07-24 Thread Joseph S. Myers
On Thu, 24 Jul 2008, Basile STARYNKEVITCH wrote: > At last, at the recent (july 2008) GCC summit, someone (sorry I forgot who, > probably someone from SuSE) proposed in a BOFS to have architecture and > machine specific hand-tuned (or even hand-written assembly) low level > libraries for such basi

Re: gcc will become the best optimizing x86 compiler

2008-07-24 Thread Richard Guenther
On Thu, Jul 24, 2008 at 3:28 PM, Agner Fog <[EMAIL PROTECTED]> wrote: > Basile STARYNKEVITCH wrote: >>At last, at the recent (july 2008) GCC summit, someone (sorry I forgot who, >> probably someone from SuSE) That was me and Michael Matz. Richard.

RE: gcc will become the best optimizing x86 compiler

2008-07-24 Thread Dave Korn
Basile STARYNKEVITCH wrote on 24 July 2008 11:28: > On most Linux systems, in addition of using the package manager, the > libc.so file is executable, and when executed, shows info, so on my > Debian/Sid/AMD64 I'm getting > > % /lib/libc.so.6 > GNU C Library stable release version 2.7, by Rolan

Re: gcc will become the best optimizing x86 compiler

2008-07-24 Thread Basile STARYNKEVITCH
Dave Korn wrote: > Agner Fog wrote on 24 July 2008 09:04: >> Tim Prince wrote: >>> you identify the library you tested only as "ubuntu g++ 4.2.3." >> Where can I see the libc version? > Use whichever package manager ubuntu provides to check the version of the glibc package. Here's an example from a ce

RE: gcc will become the best optimizing x86 compiler

2008-07-24 Thread Dave Korn
Agner Fog wrote on 24 July 2008 09:04: > Tim Prince wrote: > >you identify the library you tested only as "ubuntu g++ 4.2.3." > Where can I see the libc version? Use whichever package manager ubuntu provides to check the version of the glibc package. Here's an example from a centos (using rpm

Re: gcc will become the best optimizing x86 compiler

2008-07-24 Thread Zoltán Kócsi
> [...] > I have made a few optimized functions myself and published them as a > multi-platform library (www.agner.org/optimize/asmlib.zip). It is > faster than most other libraries on an Intel Core2 and up to ten > times faster than gcc using builtin functions. My library is > published with GPL

Re: gcc will become the best optimizing x86 compiler

2008-07-24 Thread Agner Fog
Dennis Clarke wrote: >The Sun Studio 12 compiler with Solaris 10 on AMD Opteron or >UltraSparc beats GCC in almost every single test case that I have >seen. This is memcpy on Solaris: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/i386/gen/memcpy.s It uses exactly the sam

Re: gcc will become the best optimizing x86 compiler

2008-07-23 Thread Dennis Clarke
On Wed, Jul 23, 2008 at 12:15 PM, Agner Fog <[EMAIL PROTECTED]> wrote: > Hi, I am doing research on optimization of microprocessors and compilers. > Some of you already know my optimization manuals (www.agner.org/optimize/). Sorry but I'm not buying. The Sun Studio 12 compiler with Solaris 10 on

Re: gcc will become the best optimizing x86 compiler

2008-07-23 Thread Tim Prince
Agner Fog wrote: > I have tested a few of the most important functions in libc and compared them with other available libraries (MS, Borland, Intel, Mac). The comparison does not look good for gnu libc. See my test results in http://www.agner.org/optimize/optimizing_cpp.pdf section 2.6. As far