This patchset improves translate in several ways, with a particular focus on making it more useful and faster for hardware drivers.
The first part of the patchset is a couple of simple changes: support for using memcpy() directly when the input and output formats are identical, and support for 8-bit and 16-bit indices in addition to 32-bit ones.

The second part is a more ambitious and experimental rewrite of the translate_sse code, which adds x86-64 support and accelerates all useful format conversions (in particular, any swizzle with no conversion, and any conversion to float32 or to any 16-bit integer format).

Currently, translate_sse does not work on x86-64 at all, and only supports float32 and two unorm8 formats, both as input and as output. This makes it unusable for compensating for holes in GPU vertex format support, or for pushing vertices directly onto the FIFO.

Luca Barbieri (6):
  translate_generic: use memcpy if possible
  translate_generic: factor out common code between linear and indexed
  translate_sse: remove useless generated function wrappers
  translate: add support for 8/16-bit indices
  rtasm: add minimal x86-64 support and new instructions
  translate_sse: major rewrite

 src/gallium/auxiliary/rtasm/rtasm_cpu.c            |    6 +-
 src/gallium/auxiliary/rtasm/rtasm_x86sse.c         |  447 +++++++-
 src/gallium/auxiliary/rtasm/rtasm_x86sse.h         |   67 +-
 src/gallium/auxiliary/translate/translate.h        |   12 +
 .../auxiliary/translate/translate_generic.c        |  200 ++--
 src/gallium/auxiliary/translate/translate_sse.c    | 1224 +++++++++++++++-----
 6 files changed, 1536 insertions(+), 420 deletions(-)

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
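
For illustration only, here is a rough standalone sketch of the two simple changes: fetching indices of 1, 2 or 4 bytes, and falling back to memcpy() when the input and output formats are identical. It is not code from the patches. Every name in it (sketch_fetch_index, sketch_convert, sketch_run_elts, SKETCH_VERTEX_SIZE) is hypothetical, the vertex is assumed to be a single 4-byte element, and the "conversion" is just a byte swizzle standing in for a real format conversion.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: each vertex is one 4-byte element. */
#define SKETCH_VERTEX_SIZE 4

/* Hypothetical helper: read entry i of an index buffer whose entries are
 * index_size bytes wide (1, 2 or 4). Unsupported sizes yield index 0. */
static uint32_t
sketch_fetch_index(const void *indices, unsigned index_size, size_t i)
{
   switch (index_size) {
   case 1:  return ((const uint8_t *)indices)[i];
   case 2:  return ((const uint16_t *)indices)[i];
   case 4:  return ((const uint32_t *)indices)[i];
   default: return 0;
   }
}

/* Hypothetical stand-in for a format conversion: swap bytes 0 and 2,
 * as in an RGBA <-> BGRA swizzle of 8-bit channels. */
static void
sketch_convert(uint8_t *dst, const uint8_t *src)
{
   dst[0] = src[2];
   dst[1] = src[1];
   dst[2] = src[0];
   dst[3] = src[3];
}

/* Gather count vertices from src (stride bytes apart) into a tightly
 * packed dst, selecting them through the index buffer. */
static void
sketch_run_elts(uint8_t *dst, const uint8_t *src, size_t stride,
                const void *indices, unsigned index_size, size_t count,
                int same_format)
{
   for (size_t n = 0; n < count; n++) {
      size_t elt = sketch_fetch_index(indices, index_size, n);
      const uint8_t *vertex = src + elt * stride;

      if (same_format)
         memcpy(dst, vertex, SKETCH_VERTEX_SIZE); /* no conversion needed */
      else
         sketch_convert(dst, vertex);

      dst += SKETCH_VERTEX_SIZE;
   }
}
```

In this sketch the caller passes the index width in bytes, so a driver holding a 16-bit index buffer would not need to expand it to 32 bits first. How the actual patches expose the index width is not shown here.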