Hello,

This series does some long-standing cleanup I've been meaning to do in the i965 state upload code. The distinction between BRW_NEW_* and CACHE_NEW_* flags has been pretty arbitrary for a while - 10 of the 17 CACHE_NEW_* flags were for things we stopped caching years ago. So, I moved those to be BRW_NEW_* bits, and combined a bunch of redundant ones while I was at it.
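To make the flag shuffle concrete, here is a rough sketch of the idea. This is not the actual i965 code: the struct names, field widths, and flag values below are made up for illustration. It only shows how a dirty-bit test over separate "brw" and "cache" bitsets collapses once a CACHE_NEW_* bit becomes a BRW_NEW_* bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define CACHE_NEW_VS_PROG     (1u << 0)    /* old: lives in the "cache" bitset */
#define BRW_NEW_VS_PROG_DATA  (1ull << 1)  /* new: lives with the other BRW_NEW_* bits */

/* Assumed "before" layout: three independent dirty bitsets. */
struct old_state_flags {
    uint32_t mesa;
    uint64_t brw;
    uint32_t cache;
};

/* Assumed "after" layout: the cache bits have been folded into brw. */
struct new_state_flags {
    uint32_t mesa;
    uint64_t brw;
};

/* An atom needs emitting if any bit it listens for is dirty. */
bool old_check_state(const struct old_state_flags *dirty,
                     const struct old_state_flags *atom)
{
    return ((dirty->mesa & atom->mesa) |
            (dirty->brw & atom->brw) |
            (dirty->cache & atom->cache)) != 0;
}

/* Same test with one fewer bitset to AND and OR together. */
bool new_check_state(const struct new_state_flags *dirty,
                     const struct new_state_flags *atom)
{
    return ((dirty->mesa & atom->mesa) |
            (dirty->brw & atom->brw)) != 0;
}
```

In this sketch, an atom that used to list CACHE_NEW_VS_PROG in its cache field would instead list BRW_NEW_VS_PROG_DATA in its brw field, and whatever used to flag the cache bit would flag the brw bit.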
Patches 1-6 move non-cache-related things out of .cache, along with other tidying. This actually could save up to 160 bytes of memory per context (on 64-bit), because cache types have auxiliary compare and free function pointers...which weren't used at all for these. (I haven't actually measured this - just eliminated the fields.)

Patches 7-10 take it a step further, and kill off the "cache" bitset altogether. A while back, I was looking at callgrind graphs for Glamor, trying to reduce brw_state_upload costs. One of the places where I saw cycles being wasted was in check_state(), which sees if each atom needs to be emitted. Eliminating "cache" should eliminate 1/4 of the cycles spent there, and every little bit helps.

I also like the new names - BRW_NEW_VERTEX_PROGRAM vs CACHE_NEW_VS_PROG was always confusing - which is which, and why should I use one or the other? BRW_NEW_VS_PROG_DATA is clearly tied to brw_vs_prog_data.

No regressions on 965, GM45, Ironlake, Sandybridge GT1/2, Ivybridge GT1/2, or Haswell GT3e. I really should check Broadwell before pushing, but haven't yet.

This is available as the 'state-kill-cache' branch of my tree. It depends on the ddx/ddy cleanups I sent yesterday.

--Ken