Re: [Fwd: performance with gcc -O0/-O2]

2007-11-27 Thread Howard Chu
J.C. Pizarro wrote: For your Opteron, try with this option -O3 -fomit-frame-pointer -march=k8 -funroll-loops -finline-functions -fpeel-loops \ -mno-sse3 -msse2 -msse -mno-mmx -mno-3dnow The Opteron hardware said that it's better to use SSE2 than SSE3. The MMX and 3DNow!+ instructions are shorte

Re: [Fwd: performance with gcc -O0/-O2]

2007-11-27 Thread Andrew Haley
Andi Kleen writes: > Andrew Haley <[EMAIL PROTECTED]> writes: > > > Howard Chu writes: > > > > > A bit of a minor mystery. Not a problem, just a curiosity. If > > > someone knew off the top of their head a reason for it, that'd be > > > cool, but otherwise no sweat. > > > > It's possib

Re: [Fwd: performance with gcc -O0/-O2]

2007-11-27 Thread Howard Chu
Richard Guenther wrote: On Nov 27, 2007 2:23 PM, Howard Chu <[EMAIL PROTECTED]> wrote: A bit of a minor mystery. Not a problem, just a curiosity. If someone knew off the top of their head a reason for it, that'd be cool, but otherwise no sweat. I'd try -Os, you might run into ICache limitation

Re: [Fwd: performance with gcc -O0/-O2]

2007-11-27 Thread Andi Kleen
Andrew Haley <[EMAIL PROTECTED]> writes: > Howard Chu writes: > > > A bit of a minor mystery. Not a problem, just a curiosity. If > > someone knew off the top of their head a reason for it, that'd be > > cool, but otherwise no sweat. > > It's possible, although unlikely, that the optimized code

Re: [Fwd: performance with gcc -O0/-O2]

2007-11-27 Thread Tim Prince
Richard Guenther wrote: > On Nov 27, 2007 2:23 PM, Howard Chu <[EMAIL PROTECTED]> wrote: >> A bit of a minor mystery. Not a problem, just a curiosity. If someone knew >> off the top of their head a reason for it, that'd be cool, but otherwise no >> sweat. > > I'd try -Os, you might run into I

Re: [Fwd: performance with gcc -O0/-O2]

2007-11-27 Thread J.C. Pizarro
For your Opteron, try with this option -O3 -fomit-frame-pointer -march=k8 -funroll-loops -finline-functions -fpeel-loops \ -mno-sse3 -msse2 -msse -mno-mmx -mno-3dnow The Opteron hardware said that it's better to use SSE2 than SSE3. The MMX and 3DNow!+ instructions are shorter and older than SSE2/

Re: [Fwd: performance with gcc -O0/-O2]

2007-11-27 Thread Andrew Haley
Howard Chu writes: > A bit of a minor mystery. Not a problem, just a curiosity. If > someone knew off the top of their head a reason for it, that'd be > cool, but otherwise no sweat. It's possible, although unlikely, that the optimized code has worse cache behaviour. No way to know better wit

Re: [Fwd: performance with gcc -O0/-O2]

2007-11-27 Thread Richard Guenther
On Nov 27, 2007 2:23 PM, Howard Chu <[EMAIL PROTECTED]> wrote: > A bit of a minor mystery. Not a problem, just a curiosity. If someone knew off > the top of their head a reason for it, that'd be cool, but otherwise no sweat. I'd try -Os, you might run into ICache limitations. Richard. >