Hi all,

Last year I spent a whole bunch of time profiling Mesa, looking for areas where improvements could be made. I thought I'd point out a couple of things and see if anyone thinks they are worth following up.

1. While the hash table has been getting a lot of attention lately, after running the TF2 benchmark one place that showed up as using more CPU than the hash table was the GLSL parser. I guess this can be mostly solved once Mesa has a disk cache for shaders. But something I came across at the time was this paper describing how to modify bison (with apparently little effort) to generate a hardcoded parser that is 2.5-6.5 times faster while generating a slightly bigger binary [1]. Unfortunately the resulting project has been lost in the sands of time, so I couldn't try it out.

2. On most of the old Quake engine benchmarks the Intel driver spends between 3-4.5% of its time, or 400 million calls, in glibc, since the memcpy() call can't be inlined in this bit of code from copy_array_to_vbo_array():

    while (count--) {
       memcpy(dst, src, dst_stride);
       src += src_stride;
       dst += dst_stride;
    }

I looked in other drivers but I couldn't see them doing this kind of thing. I'd imagine that because of its nature this code could be a bottleneck. Are there any easy ways to avoid doing this type of copy? Or would the only option be to write a complex optimisation?

Tim

[1] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.50.4539

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
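
P.S. To make the question in item 2 concrete, here is a minimal sketch of the kind of "easy" change I have in mind. It is not Mesa code: the function name copy_array_sketch, its signature, and the choice of 8, 12 and 16 byte special cases are all assumptions for illustration. It keeps the meaning of the loop above (dst_stride bytes copied per element into a tightly packed destination) and adds two fast paths: a single memcpy() when the source is equally tightly packed, and constant-size copies, which compilers can typically expand inline when optimising. Whether either path is a real win would need measuring.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the real copy_array_to_vbo_array().
 * Copies `count` elements of `dst_stride` bytes each from a source
 * laid out with `src_stride` bytes between elements into a tightly
 * packed destination. */
static void
copy_array_sketch(uint8_t *dst, const uint8_t *src,
                  size_t src_stride, size_t dst_stride, size_t count)
{
   if (count == 0)
      return;

   assert(dst != NULL && src != NULL);

   /* Fast path 1: the source is as tightly packed as the destination,
    * so the whole range is contiguous and one call covers it. */
   if (src_stride == dst_stride) {
      memcpy(dst, src, count * dst_stride);
      return;
   }

/* Fast path 2: a compile-time constant size gives the compiler the
 * chance to replace the memcpy() call with a few load/store
 * instructions instead of a library call. */
#define COPY_FIXED(n)            \
   while (count--) {             \
      memcpy(dst, src, (n));     \
      src += src_stride;         \
      dst += (n);                \
   }                             \
   return

   switch (dst_stride) {
   case 8:
      COPY_FIXED(8);
   case 12:
      COPY_FIXED(12);
   case 16:
      COPY_FIXED(16);
   default:
      break;
   }
#undef COPY_FIXED

   /* General fallback: the same loop as in the driver today. */
   while (count--) {
      memcpy(dst, src, dst_stride);
      src += src_stride;
      dst += dst_stride;
   }
}
```

A call such as copy_array_sketch(dst, src, 16, 8, count) would take the 8 byte path, while equal strides collapse to one memcpy() for the whole array.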