Hi,

I looked a bit more at which paths we should try to optimize in the mesa/gallium/pipe infrastructure. Here are some numbers gathered from games (draw calls per call of each function):

             ps constant   vs constant   ps sampler   vs sampler
doom3               1.45          1.39         9.24         9.86
nexuiz              6.27          5.98         6.84         7.30
openarena        2805.64          1.38         1.51         1.54

(A value of 1 means the function is called on every draw call, while a value of 10 means it is called once every 10 draw calls, on average.)

Note that the openarena ps constant number is understandable: the fixed GL pipeline is in use there, and the pixel shader constants don't need to change much in that case.

So I think the clear trend is that there is a lot of constant uploading and sampler changing (almost at every draw call for some games). Thus I think we want to make sure that we have a really fast path for uploading constants and for changing samplers. I think those paths should be changed and should avoid using some of the gallium infrastructure.

For shader constants, I think the best solution is to provide the pointer to the program constant buffer directly to the pipe driver and let the driver choose how it wants to upload the constants to the GPU. GPUs have different capabilities: some can stream the constant buffer inside their command stream, others can just keep a pool of buffers around into which they can memcpy, and so on. As there is no common denominator, I don't think we should go through the pipe buffer allocation and provide a new pipe buffer each time.

Optimizing this for r600g gives about a 7% increase in games when draw is a nop, about 5% when not submitting to the GPU, and about 3% when no part of the driver is commented out. r600g has other bottlenecks that tend to minimize the gain we can get from such an optimization. Patch at http://people.freedesktop.org/~glisse/gallium_const_path/

For samplers, I don't think we want to create persistent objects. We are spending precious time building, hashing, and searching for a similar sampler each time in the gallium code. I think it would be best to treat this state as use once and forget. That said, we can provide helper functions to pipe drivers that want to cache samplers (but even for virtual hw I don't think this makes sense). I haven't yet implemented a fast path for samplers to see how much we can win from that, but I will report back once I do.

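As a rough illustration of the constant path proposed above, here is a small C sketch. It is not the linked patch, and every name in it (`sketch_driver`, `sketch_set_constants`, the `UPLOAD_*` modes, the buffer sizes) is made up for the example rather than taken from gallium or r600g. It only assumes what the text says: the state tracker hands the driver a raw pointer to the program constants, and the driver picks its own upload strategy, either streaming into a command stream or copying into a pool of buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: none of these names are real gallium or r600g API. */

#define CMD_FLOATS  256   /* size of the stand-in command stream */
#define POOL_SLOTS  4     /* number of buffers in the stand-in pool */
#define SLOT_FLOATS 64    /* capacity of one pool buffer */

enum upload_kind { UPLOAD_STREAM, UPLOAD_POOL };

struct sketch_driver {
    enum upload_kind kind;                 /* strategy this "hardware" supports */
    float cmd[CMD_FLOATS];                 /* stand-in for a command stream */
    size_t cmd_used;                       /* floats already written to cmd */
    float pool[POOL_SLOTS][SLOT_FLOATS];   /* stand-in for a pool of buffers */
    unsigned next_slot;                    /* next pool buffer to overwrite */
};

/*
 * The caller passes its constant array directly; no intermediate buffer
 * object is allocated per update. Returns 0 on success, -1 if the data
 * does not fit.
 */
int sketch_set_constants(struct sketch_driver *drv,
                         const float *consts, size_t count)
{
    assert(drv != NULL && consts != NULL);

    if (drv->kind == UPLOAD_STREAM) {
        /* Append the constants inline to the command stream. */
        if (count > CMD_FLOATS - drv->cmd_used)
            return -1;  /* a real driver would flush and retry here */
        memcpy(drv->cmd + drv->cmd_used, consts, count * sizeof(*consts));
        drv->cmd_used += count;
    } else {
        /* Copy into the next buffer of the pool, round-robin. */
        if (count > SLOT_FLOATS)
            return -1;
        memcpy(drv->pool[drv->next_slot], consts, count * sizeof(*consts));
        drv->next_slot = (drv->next_slot + 1) % POOL_SLOTS;
    }
    return 0;
}
```

The point of the sketch is the shape of the entry point: one call taking a pointer and a count, with the choice between streaming and pooling kept entirely inside the driver. Synchronization with the GPU, which a real pool would need before reusing a buffer, is left out.
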
So a more fundamental question here is: should we move away from persistent state and consider all states (except shaders and textures) as too volatile for caching any of them to make sense from a performance point of view? That would mean changing a lot of the create/bind/delete interfaces into simple set interfaces for the pipe driver. This could be seen as a simplification.

Anyway, I think we should really consider moving more toward set than create/bind/delete. I liked the create/bind/delete paradigm a lot, but it doesn't seem to be the one you want with GL, at least from the numbers I gathered with some games.

Cheers,
Jerome Glisse