On Fri, Apr 17, 2015 at 1:21 PM, Timothy Arceri <t_arc...@yahoo.com.au> wrote:
> Hi all,
>
> Last year I spent a whole bunch of time profiling Mesa looking for areas
> where improvements could be made. Anyway I thought I'd point out a
> couple of things, and see if anyone thinks these are worthwhile
> following up.
>
> 1. While the hash table has been getting a lot of attention lately,
> after running the TF2 benchmark one place that showed up as using more
> CPU than the hash table was the GLSL parser. I guess this can be mostly
> solved once Mesa has a disk cache for shaders.
>
> But something I came across at the time was this paper describing
> modifying (with apparently little effort) bison to generate a hardcoded
> parser that is 2.5-6.5 times faster while generating a slightly bigger
> binary [1].
>
> The resulting project has been lost in the sands of time unfortunately
> so I couldn't try it out.
>
> 2. On most of the old Quake engine benchmarks the Intel driver spends
> between 3-4.5% of its time, or 400 million calls, in glibc, since the
> memcpy can't be inlined in this bit of code from
> copy_array_to_vbo_array():
>
>    while (count--) {
>       memcpy(dst, src, dst_stride);
>       src += src_stride;
>       dst += dst_stride;
>    }
>
> I looked in other drivers but I couldn't see them doing this kind of
> thing. I'd imagine because of its nature this code could be a
> bottleneck. Are there any easy ways to avoid doing this type of copy? Or
> would the only thing possible be to write a complex optimisation?

Yeah, other drivers don't do this. In Gallium, we don't change the stride
when uploading buffers, so in our case src_stride == dst_stride.

Marek
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
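
For illustration only, here is a rough sketch of why equal strides make the per-element loop unnecessary: when src_stride == dst_stride, the quoted loop copies one contiguous range, so a single memcpy of count * stride bytes does the same work. The helper name copy_strided and its signature are made up for this example and are not taken from Mesa or Gallium; the fallback branch is just the quoted loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not actual Mesa code: copies `count` elements of
 * `dst_stride` bytes each, with the same semantics as the quoted loop. */
static void
copy_strided(uint8_t *dst, size_t dst_stride,
             const uint8_t *src, size_t src_stride,
             size_t count)
{
   if (src_stride == dst_stride) {
      /* Equal strides: source and destination have the same layout, so
       * the whole range can be copied with one call. */
      memcpy(dst, src, count * dst_stride);
      return;
   }

   /* Differing strides: per-element copy, as in the quoted loop. The
    * size is only known at run time, so each call goes to the library. */
   while (count--) {
      memcpy(dst, src, dst_stride);
      src += src_stride;
      dst += dst_stride;
   }
}
```

Whether a fast path like this would help the Intel driver depends on how often the two strides actually match there, which this sketch does not attempt to answer.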