Chris Fairles wrote:

Kristian Poul Herkild wrote:

Robert Crawford wrote:

On Sun December 4 2005 4:11 am, Kristian Poul Herkild wrote:



-mfpmath=sse is not a good idea; the consensus is that it actually lowers performance. -msse, -mmmx and -m3dnow are redundant (implied by -march=athlon-xp) and should be removed from your CFLAGS line, but they SHOULD be placed in your USE= line, without the - sign, like this:

USE="mmx 3dnow sse"

If you use gcc-3.4.4, the following flags should work fine (I've used them for a long time with no problems).

CFLAGS="-march=athlon-xp -O3 -pipe -fomit-frame-pointer -fweb -ftracer -fprefetch-loop-arrays -ffast-math -falign-functions=64 -fno-ident"

CXXFLAGS="${CFLAGS} -fvisibility-inlines-hidden"



Hmm... according to this thread, http://forums.gentoo.org/viewtopic.php?t=43648, and the GCC manual, -march does not imply -mmmx, -msse or -m3dnow, nor does it imply -mfpmath=sse. I know of no consensus that -mfpmath=sse lowers performance. Actually, I have only heard the opposite, from the LFS community as well as the Gentoo Wiki.

I don't want to start a flamewar on this, so if you have more accurate information than I do, then please share it :)

-Kristian Poul Herkild


Straight from the source, ../gcc-3.4.4/gcc/config/i386/i386.c:

     {"athlon-xp", PROCESSOR_ATHLON, PTA_MMX | PTA_PREFETCH_SSE | PTA_3DNOW
                                     | PTA_3DNOW_A | PTA_SSE}

This entry includes MMX, SSE and 3DNow! support (but not SSE2: there is no PTA_SSE2 in the mask), making explicit declarations in CFLAGS redundant.

and the others...

     {"i386", PROCESSOR_I386, 0},
     {"i486", PROCESSOR_I486, 0},
     {"i586", PROCESSOR_PENTIUM, 0},
     {"pentium", PROCESSOR_PENTIUM, 0},
     {"pentium-mmx", PROCESSOR_PENTIUM, PTA_MMX},
     {"winchip-c6", PROCESSOR_I486, PTA_MMX},
     {"winchip2", PROCESSOR_I486, PTA_MMX | PTA_3DNOW},
     {"c3", PROCESSOR_I486, PTA_MMX | PTA_3DNOW},
     {"c3-2", PROCESSOR_PENTIUMPRO, PTA_MMX | PTA_PREFETCH_SSE | PTA_SSE},
     {"i686", PROCESSOR_PENTIUMPRO, 0},
     {"pentiumpro", PROCESSOR_PENTIUMPRO, 0},
     {"pentium2", PROCESSOR_PENTIUMPRO, PTA_MMX},
     {"pentium3", PROCESSOR_PENTIUMPRO, PTA_MMX | PTA_SSE | PTA_PREFETCH_SSE},
     {"pentium3m", PROCESSOR_PENTIUMPRO, PTA_MMX | PTA_SSE | PTA_PREFETCH_SSE},
     {"pentium-m", PROCESSOR_PENTIUMPRO, PTA_MMX | PTA_SSE | PTA_PREFETCH_SSE
                                         | PTA_SSE2},
     {"pentium4", PROCESSOR_PENTIUM4, PTA_SSE | PTA_SSE2 | PTA_MMX
                                      | PTA_PREFETCH_SSE},
     {"pentium4m", PROCESSOR_PENTIUM4, PTA_SSE | PTA_SSE2 | PTA_MMX
                                       | PTA_PREFETCH_SSE},
     {"prescott", PROCESSOR_PENTIUM4, PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_MMX
                                      | PTA_PREFETCH_SSE},
     {"nocona", PROCESSOR_PENTIUM4, PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_64BIT
                                    | PTA_MMX | PTA_PREFETCH_SSE},
     {"k6", PROCESSOR_K6, PTA_MMX},
     {"k6-2", PROCESSOR_K6, PTA_MMX | PTA_3DNOW},
     {"k6-3", PROCESSOR_K6, PTA_MMX | PTA_3DNOW},
     {"athlon", PROCESSOR_ATHLON, PTA_MMX | PTA_PREFETCH_SSE | PTA_3DNOW
                                  | PTA_3DNOW_A},
     {"athlon-tbird", PROCESSOR_ATHLON, PTA_MMX | PTA_PREFETCH_SSE
                                        | PTA_3DNOW | PTA_3DNOW_A},
     {"athlon-4", PROCESSOR_ATHLON, PTA_MMX | PTA_PREFETCH_SSE | PTA_3DNOW
                                    | PTA_3DNOW_A | PTA_SSE},
     {"athlon-xp", PROCESSOR_ATHLON, PTA_MMX | PTA_PREFETCH_SSE | PTA_3DNOW
                                     | PTA_3DNOW_A | PTA_SSE},
     {"athlon-mp", PROCESSOR_ATHLON, PTA_MMX | PTA_PREFETCH_SSE | PTA_3DNOW
                                     | PTA_3DNOW_A | PTA_SSE},
     {"x86-64", PROCESSOR_K8, PTA_MMX | PTA_PREFETCH_SSE | PTA_64BIT
                              | PTA_SSE | PTA_SSE2 },
     {"k8", PROCESSOR_K8, PTA_MMX | PTA_PREFETCH_SSE | PTA_3DNOW | PTA_64BIT
                          | PTA_3DNOW_A | PTA_SSE | PTA_SSE2},
     {"opteron", PROCESSOR_K8, PTA_MMX | PTA_PREFETCH_SSE | PTA_3DNOW | PTA_64BIT
                               | PTA_3DNOW_A | PTA_SSE | PTA_SSE2},
     {"athlon64", PROCESSOR_K8, PTA_MMX | PTA_PREFETCH_SSE | PTA_3DNOW | PTA_64BIT
                                | PTA_3DNOW_A | PTA_SSE | PTA_SSE2}
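
To make the table easier to read, here is a minimal sketch of how such a name-to-bitmask table can be queried. It is not GCC's actual code: the bit values given to the PTA_* names, the struct pta_entry type and the pta_lookup/pta_print helpers are all assumptions made up for illustration, and only a few rows are copied from the table (with the processor column left out).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical bit values; the real PTA_* constants in i386.c may differ. */
enum {
    PTA_SSE          = 1 << 0,
    PTA_SSE2         = 1 << 1,
    PTA_MMX          = 1 << 2,
    PTA_PREFETCH_SSE = 1 << 3,
    PTA_3DNOW        = 1 << 4,
    PTA_3DNOW_A      = 1 << 5
};

/* Illustrative row type: -march name plus its feature mask. */
struct pta_entry {
    const char *name;
    int flags;
};

/* A few rows copied from the table above (processor column omitted). */
static const struct pta_entry pta_table[] = {
    {"i686",      0},
    {"pentium3",  PTA_MMX | PTA_SSE | PTA_PREFETCH_SSE},
    {"pentium4",  PTA_SSE | PTA_SSE2 | PTA_MMX | PTA_PREFETCH_SSE},
    {"athlon-xp", PTA_MMX | PTA_PREFETCH_SSE | PTA_3DNOW
                  | PTA_3DNOW_A | PTA_SSE}
};

/* Return the feature mask for an -march name, or -1 if it is not listed. */
static int pta_lookup(const char *march)
{
    size_t i;

    for (i = 0; i < sizeof pta_table / sizeof pta_table[0]; i++)
        if (strcmp(pta_table[i].name, march) == 0)
            return pta_table[i].flags;
    return -1;
}

/* Print the -m options that the mask for this -march name covers. */
static void pta_print(const char *march)
{
    int f = pta_lookup(march);

    if (f < 0) {
        printf("%s: not in table\n", march);
        return;
    }
    printf("%s:%s%s%s%s\n", march,
           (f & PTA_MMX)   ? " -mmmx"   : "",
           (f & PTA_SSE)   ? " -msse"   : "",
           (f & PTA_SSE2)  ? " -msse2"  : "",
           (f & PTA_3DNOW) ? " -m3dnow" : "");
}
```

Calling pta_print("athlon-xp") from a main() lists the options covered by that row. The point is only that the mask for athlon-xp carries PTA_MMX, PTA_SSE and PTA_3DNOW, while PTA_SSE2 first shows up in rows such as pentium4.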


cheers,
chris

I found another neat trick to find out what a flag sets/unsets.

echo "int main(void){return 0;}" > foo.c
gcc -v -Q -march=athlon-xp foo.c

churns out:

options passed:  -v -march=athlon-xp -auxbase
options enabled:  -feliminate-unused-debug-types -fpeephole -ffunction-cse
-fkeep-static-consts -fpcc-struct-return -fgcse-lm -fgcse-sm -fgcse-las
-fsched-interblock -fsched-spec -fsched-stalled-insns
-fsched-stalled-insns-dep -fbranch-count-reg -fcommon -fargument-alias
-fzero-initialized-in-bss -fident -fmath-errno -ftrapping-math -m80387
-mhard-float -mno-soft-float -mieee-fp -mfp-ret-in-387
-maccumulate-outgoing-args -mmmx -m3dnow -msse -mno-red-zone
-mtls-direct-seg-refs -mtune=athlon-xp -march=athlon-xp

Replace -march=athlon-xp with any or all of the CFLAGS you can think of to see what each one sets.
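
A complementary check is to ask the preprocessor. This is a rough sketch that assumes your GCC predefines __MMX__, __SSE__, __SSE2__, __3dNOW__ and __SSE_MATH__ when the matching options are active (older releases may not define all of them). The HAVE_* macros and the isa_append/isa_report helpers are names invented for this example.

```c
#include <stdio.h>
#include <string.h>

/* Map each predefined macro (if the compiler sets it) to 0 or 1. */
#ifdef __MMX__
#define HAVE_MMX 1
#else
#define HAVE_MMX 0
#endif

#ifdef __SSE__
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

#ifdef __SSE2__
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

#ifdef __3dNOW__
#define HAVE_3DNOW 1
#else
#define HAVE_3DNOW 0
#endif

/* Assumed to be defined only when SSE is used for floating-point math. */
#ifdef __SSE_MATH__
#define HAVE_SSE_MATH 1
#else
#define HAVE_SSE_MATH 0
#endif

/* Append "name: yes" or "name: no" to the string already in buf. */
static void isa_append(char *buf, size_t n, const char *name, int on)
{
    size_t used = strlen(buf);

    if (used < n)
        snprintf(buf + used, n - used, "%s: %s\n", name, on ? "yes" : "no");
}

/* Fill buf with one line per feature, as seen at compile time. */
static void isa_report(char *buf, size_t n)
{
    if (n == 0)
        return;
    buf[0] = '\0';
    isa_append(buf, n, "MMX", HAVE_MMX);
    isa_append(buf, n, "SSE", HAVE_SSE);
    isa_append(buf, n, "SSE2", HAVE_SSE2);
    isa_append(buf, n, "3DNow!", HAVE_3DNOW);
    isa_append(buf, n, "SSE math", HAVE_SSE_MATH);
}
```

Call isa_report() from a main() and print the buffer, then build the file once with only -march=athlon-xp and once with -mfpmath=sse added. Comparing the two reports shows what each set of flags changes on your compiler.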

cheers,
chris



--
gentoo-user@gentoo.org mailing list
