On Fri, 2011-11-11 at 06:52 -0800, Jose Fonseca wrote:
>
> ----- Original Message -----
> >
> > On Friday, 11 November 2011 14:33 CET, Michel Dänzer
> > <mic...@daenzer.net> wrote:
> >
> > > On Fri, 2011-11-11 at 14:15 +0100, Theiss, Ingo wrote:
> > > >
> > > > Here are the compiler flags used.
> > > >
> > > > 32-bit:
> > > >
> > > > CFLAGS: -O2 -Wall -g -m32 -march=amdfam10 -mtune=amdfam10
> > > > -fno-omit-frame-pointer -Wall -Wmissing-prototypes -std=c99
> > > > -ffast-math -fno-strict-aliasing -fno-builtin-memcmp -m32 -O2
> > > > -Wall -g -m32 -march=amdfam10 -mtune=amdfam10
> > > > -fno-omit-frame-pointer -fPIC -m32
> > >
> > > Have you tried adding -mfpmath=sse to the 32-bit CFLAGS? According
> > > to my gcc documentation, that option is enabled by default in
> > > 64-bit mode but disabled in 32-bit mode.
> > >
> > > Anyway, I guess there's room for optimization in glReadPixels...
> >
> > OK, I have added -mfpmath=sse to the 32-bit CFLAGS and the readback
> > performance increased from 30.44 Mpixels/sec to 48.92 Mpixels/sec.
> > We are getting closer to the 64-bit performance.
>
> Hmm. You should try -msse2 too. It's implied on 64-bit, and I'm not
> sure if -march/-mfpmath=sse by itself will enable the intrinsics.

From my reading of the gcc docs, it's implied by -march=amdfam10.

-- 
Earthling Michel Dänzer           |                   http://www.amd.com
Libre software enthusiast         |          Debian, X and DRI developer
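
Rather than relying on a reading of the docs, one way to check whether a given set of CFLAGS really turns on SSE2 code generation and SSE scalar math is to test the macros the compiler predefines. The following is only a minimal sketch: `__SSE2__`, `__SSE_MATH__` and `__SSE2_MATH__` are macros GCC predefines, while the helper names (`sse2_enabled`, `sse_math_enabled`, `yes_no`, `report_simd_config`) are made up for illustration and are not part of Mesa.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if the compiler may emit SSE2 instructions for this
 * translation unit (GCC predefines __SSE2__ in that case), else 0. */
int sse2_enabled(void)
{
#ifdef __SSE2__
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if scalar floating-point math uses SSE registers.
 * GCC predefines __SSE_MATH__ (float) and __SSE2_MATH__ (double)
 * when -mfpmath=sse is in effect. */
int sse_math_enabled(void)
{
#if defined(__SSE_MATH__) || defined(__SSE2_MATH__)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: maps a flag to a printable string. */
const char *yes_no(int flag)
{
    return flag ? "yes" : "no";
}

/* Hypothetical helper: prints both settings to `out` and returns
 * the fprintf result (negative on output error). */
int report_simd_config(FILE *out)
{
    assert(out != NULL);
    return fprintf(out,
                   "SSE2 code generation: %s\n"
                   "SSE scalar FP math:   %s\n",
                   yes_no(sse2_enabled()),
                   yes_no(sse_math_enabled()));
}
```

Compiling this once with the 32-bit CFLAGS quoted above and once with -mfpmath=sse or -msse2 added, then calling report_simd_config(stdout) from main(), should show which options each combination actually enables. The same macros can also be listed without writing any code by running gcc with the flags in question plus `-dM -E` on an empty input.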