----- Original Message -----
> On Friday, 11 November 2011 14:33 CET, Michel Dänzer
> <mic...@daenzer.net> wrote:
>
> > On Fri, 2011-11-11 at 14:15 +0100, Theiss, Ingo wrote:
> > > On Friday, 11 November 2011 12:09 CET, Michel Dänzer
> > > <mic...@daenzer.net> wrote:
> > >
> > > > So it makes sense to find a glReadPixels in VirtualGL's
> > > > glXSwapBuffers.
> > > >
> > > > Ah. I thought the time measurements in Ingo's original post
> > > > were for the Mesa glXSwapBuffers, not the VirtualGL one. If
> > > > it's the latter, then this makes sense.
> > > >
> > > > Ingo, I noticed that your 64-bit and 32-bit drivers were built
> > > > from slightly different Git snapshots. Is the problem still the
> > > > same if you build both from the same, current snapshot?
> > > >
> > > > If yes, have you compared the compiler flags that end up being
> > > > used in both cases? E.g., in 64-bit mode SSE is always
> > > > available, so there might be some auto-vectorization going on
> > > > in that case.
> > >
> > > I've rebuilt my 64-bit and 32-bit drivers from a fresh Git
> > > snapshot and turned on all processor optimizations in both builds.
> > > Nevertheless, the readback performance measured inside VirtualGL
> > > is only half of the 64-bit readback performance, and of course the
> > > rendered window scene is noticeably slower too :-(
> > >
> > > Here are the compiler flags used.
> > >
> > > 32-bit:
> > >
> > > CFLAGS: -O2 -Wall -g -m32 -march=amdfam10 -mtune=amdfam10
> > > -fno-omit-frame-pointer -Wall -Wmissing-prototypes -std=c99
> > > -ffast-math -fno-strict-aliasing -fno-builtin-memcmp -m32 -O2
> > > -Wall -g -m32 -march=amdfam10 -mtune=amdfam10
> > > -fno-omit-frame-pointer -fPIC -m32
> >
> > Have you tried adding -mfpmath=sse to the 32-bit CFLAGS? According
> > to my gcc documentation, that option is enabled by default in
> > 64-bit mode but disabled in 32-bit mode.
> >
> > Anyway, I guess there's room for optimization in glReadPixels...
>
> Ok I have added -mfpmath=sse to the 32-bit CFLAGS and the readback
> performance increased from 30.44 Mpixels/sec to 48.92 Mpixels/sec. We
> are getting closer to the 64-bit performance.
Hmm. You should try -msse2 too. It's implied on 64-bit, and I'm not
sure whether -march/-mfpmath=sse by itself will enable the intrinsics.

Jose
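One way to check this rather than guess is to ask the compiler which SSE-related macros it predefines for a given set of flags. The following is only a minimal sketch, assuming GCC (or a compatible compiler) on x86: GCC predefines `__SSE2__` when SSE2 code generation is enabled and `__SSE_MATH__` / `__SSE2_MATH__` when scalar floating-point math is done in SSE registers. The function names are hypothetical and are not part of Mesa or VirtualGL.

```c
#include <stdio.h>

/* Hypothetical helper: 1 if the compiler was told it may emit SSE2
 * instructions (GCC predefines __SSE2__ in that case), else 0. */
int have_sse2(void)
{
#ifdef __SSE2__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 if scalar float math uses SSE registers,
 * i.e. -mfpmath=sse is in effect together with SSE support. */
int sse_fpmath(void)
{
#if defined(__SSE_MATH__) || defined(__SSE2_MATH__)
    return 1;
#else
    return 0;
#endif
}

/* Print a one-line summary of what the current CFLAGS enabled. */
void print_simd_config(FILE *out)
{
    fprintf(out, "SSE2 codegen: %s, SSE fpmath: %s\n",
            have_sse2() ? "yes" : "no",
            sse_fpmath() ? "yes" : "no");
}
```

Compiling this with the same 32-bit CFLAGS as the driver and calling `print_simd_config(stdout)` from a small `main` should show whether adding -msse2 changes anything. A quicker check that needs no source file is `gcc -m32 -march=amdfam10 -dM -E - </dev/null | grep SSE`, which lists the SSE macros GCC predefines for those flags.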