Hi Michel, thanks for the reply and your suggestions.
It took me a while to figure out how to use and run oprofile, but I was finally able to produce some hopefully usable output.

The function calls mesa/state_tracker/st_cb_readpixels.c:382 -> st_readpixels and mesa/main/pack.c:552 -> _mesa_pack_rgba_span_float clearly stand out when comparing the 32 bit and 64 bit profiles.

You can take a look at the complete reports and call graph images at:

http://www.i-matrixx.de/oreport_glxspheres64.txt
https://www.i-matrixx.de/oprofile_glxspheres64.png
https://www.i-matrixx.de/oreport_glxspheres32.txt
https://www.i-matrixx.de/oprofile_glxspheres32.png

I hope this helps to find the cause and improve the driver.
Too bad I have no knowledge of C programming, as this is getting interesting.

Let me know if you need anything else.

Thanks for your time.

Regards,

Ingo

On Monday, 07 November 2011 16:10 CET, Michel Dänzer <mic...@daenzer.net> wrote:

> On Fri, 2011-11-04 at 13:38 +0100, Theiss, Ingo wrote:
> >
> > I am using VirtualGL (http://www.virtualgl.org) for full 3D hardware
> > accelerated remote OpenGL applications with latest mesa from git
> > (compiled for both 32 bit and 64 bit) on my 64 bit Debian Wheezy box.
> >
> > When I run a 32 bit application with VirtualGL I suffer nearly 50%
> > performance drop compared when running the same 64 bit application
> > with virtualGL. In the first place I have contacted the VirtualGL
> > developer and he said that the performance drop is not a VirtualGL
> > problem but related to the underlying 3D driver. The performance drop
> > seems related to the function glxSwapBuffers which can be seen in the
> > function call tracing of VirtualGL:
> >
> > 64 bit application with VirtualGL
> > -------------------------------------
> > [VGL] glXSwapBuffers (dpy=0x00deb900(:0) drawable=0x00a00002
> > pbw->getglxdrawable()=0x00800002 ) 28.770924 ms
> > [VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0
> > pbw->getglxdrawable()=0x00800002 ) 0.005960 ms
> > [VGL] glViewport (x=0 y=0 width=1240 height=900 ) 0.003099 ms
> > [VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0
> > pbw->getglxdrawable()=0x00800002 ) 0.002861 ms
> > [VGL] glPopAttrib (pbw->_dirty=0 pbw->_rdirty=0
> > pbw->getglxdrawable()=0x00800002 ) 0.002861 ms
> > [VGL] glPopAttrib (pbw->_dirty=0 pbw->_rdirty=0
> > pbw->getglxdrawable()=0x00800002 ) 0.000000 ms
> > [VGL] glPopAttrib (pbw->_dirty=0 pbw->_rdirty=0
> > pbw->getglxdrawable()=0x00800002 ) 0.000954 ms
> > [VGL] glXSwapBuffers (dpy=0x00deb900(:0) drawable=0x00a00002
> > pbw->getglxdrawable()=0x00800002 ) 29.365063 ms
> > [VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0
> > pbw->getglxdrawable()=0x00800002 ) 0.006914 ms
> >
> > 32 bit application with VirtualGL
> > -------------------------------------
> > [VGL] glXSwapBuffers (dpy=0x087f7458(:0.0) drawable=0x00a00002
> > pbw->getglxdrawable()=0x00800002 ) 65.419075 ms
> > [VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0
> > pbw->getglxdrawable()=0x00800002 ) 0.005930 ms
> > [VGL] glViewport (x=0 y=0 width=1240 height=900 ) 0.003049 ms
> > [VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0
> > pbw->getglxdrawable()=0x00800002 ) 0.002989 ms
> > [VGL] glPopAttrib (pbw->_dirty=0 pbw->_rdirty=0
> > pbw->getglxdrawable()=0x00800002 ) 0.004064 ms
> > [VGL] glPopAttrib (pbw->_dirty=0 pbw->_rdirty=0
> > pbw->getglxdrawable()=0x00800002 ) 0.001051 ms
> > [VGL] glPopAttrib (pbw->_dirty=0 pbw->_rdirty=0
> > pbw->getglxdrawable()=0x00800002 ) 0.001044 ms
> > [VGL] glXSwapBuffers (dpy=0x087f7458(:0.0) drawable=0x00a00002
> > pbw->getglxdrawable()=0x00800002 ) 65.005891 ms
> > [VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0
> > pbw->getglxdrawable()=0x00800002 ) 0.004926 ms
> >
> >
> > Is this performance drop a normal or expected behaviour when running a
> > 32 bit application on 64 bit OS or some kind of "bug"?
>
> Probably the latter. You should try to find out where the time is spent
> inside glXSwapBuffers in both cases. If the function is (at least
> roughly) CPU bound, this should be relatively easy with a profiler such
> as sysprof, perf or oprofile.
>
>
> --
> Earthling Michel Dänzer | http://www.amd.com
> Libre software enthusiast | Debian, X and DRI developer
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
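
For anyone who wants to put a number on the difference between the two traces, the per-call glXSwapBuffers time can be extracted with a few lines of Python. This is only a sketch under stated assumptions: the helper name mean_call_ms and the regular expression are hypothetical and merely match the trace lines quoted in this thread, they are not part of VirtualGL. The sample entries are the wrapped trace lines from above joined back onto one line each.

```python
import re
from statistics import mean

# Trace entries copied from the quoted VirtualGL output (wrapped lines joined).
TRACE_64 = """\
[VGL] glXSwapBuffers (dpy=0x00deb900(:0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 28.770924 ms
[VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.005960 ms
[VGL] glXSwapBuffers (dpy=0x00deb900(:0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 29.365063 ms
"""
TRACE_32 = """\
[VGL] glXSwapBuffers (dpy=0x087f7458(:0.0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 65.419075 ms
[VGL] glDrawBuffer (mode=0x00000405 pbw->_dirty=0 pbw->_rdirty=0 pbw->getglxdrawable()=0x00800002 ) 0.005930 ms
[VGL] glXSwapBuffers (dpy=0x087f7458(:0.0) drawable=0x00a00002 pbw->getglxdrawable()=0x00800002 ) 65.005891 ms
"""

# Assumed entry layout: "[VGL] <call> (<args> ) <time> ms", one entry per line.
ENTRY = re.compile(r"\[VGL\]\s+(\w+)\s+\(.*?\)\s+([0-9.]+)\s+ms")


def mean_call_ms(trace: str, call: str) -> float:
    """Average the reported time (in ms) of every `call` entry in `trace`."""
    times = [float(ms) for name, ms in ENTRY.findall(trace) if name == call]
    return mean(times)


mean_64 = mean_call_ms(TRACE_64, "glXSwapBuffers")
mean_32 = mean_call_ms(TRACE_32, "glXSwapBuffers")
ratio = mean_32 / mean_64

print(f"64 bit: {mean_64:.2f} ms, 32 bit: {mean_32:.2f} ms, ratio: {ratio:.2f}x")
```

With the two glXSwapBuffers samples per build quoted above, this gives roughly 29.1 ms versus 65.2 ms, so the 32 bit swap takes a bit more than twice as long. Two samples per build are of course a very small basis; a longer trace would give a more reliable average.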