On 05.11.2014 21:21, Ian Romanick wrote:
> On 11/04/2014 01:24 PM, Roland Scheidegger wrote:
>> On 04.11.2014 at 13:05, Juha-Pekka Heikkila wrote:
>>> +   for(i = 0; i < n; i++) {
>>> +      _mesa_clamp_float_rgba(rgba_src[i], temp, min, max);
>>> +
>>> +      *operand = _mm_mul_ps(multiplier, *operand);
>>> +      truncated_integers = _mm_cvttps_epi32(*operand);
>>> +      mmove = _mm_set_ps(aMap[map_p[ACOMP]], bMap[map_p[BCOMP]],
>>> +                         gMap[map_p[GCOMP]], rMap[map_p[RCOMP]] );
>>> +
>>> +      _mm_storeu_ps(rgba_dst[i], mmove);
>> The sse2 code at the end looks counterproductive to me. Not sure what
>> gcc will generate but I'd suspect it involves some simd->int domain
>> transition for the table lookups, plus another int->simd transition to
>> get the values back into simd domain (alternatively it might use
>> stores/load here) just so you can store them again...
>> It would probably be better to just store the values directly after the
>> table lookups.
>> But in any case actually I'm beginning to suspect no one really cares
>> about performance anyway for that path (who the hell uses these
>> scale/map features?) so whatever works...
>
> Which raises another question... do we have any piglit tests that
> actually exercise this path?

No, we don't. I made a small test for this to see how it works, and I
was planning to move it to Piglit later.

/Juha-Pekka
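
For reference, Roland's suggestion to store the values directly after the table lookups could look roughly like the sketch below. This is not the code from the patch: the helper name `map_rgba_direct`, the `scale` parameter (standing in for `multiplier`, assumed here to hold a per-channel index scale) and the local `idx` array are made up for illustration, and the input is assumed to be already clamped to [0, 1]. The SIMD part only does the multiply and the truncating conversion; the four looked-up floats are written straight to the destination instead of being packed with `_mm_set_ps` and stored again.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

enum { RCOMP, GCOMP, BCOMP, ACOMP };

/* Hypothetical helper, not part of Mesa: scale one clamped RGBA pixel to
 * table indices, then write the looked-up values directly to dst. */
static void
map_rgba_direct(float dst[4], const float clamped[4],
                const float *rMap, const float *gMap,
                const float *bMap, const float *aMap,
                const float scale[4])
{
   int idx[4];

#if defined(__SSE2__)
   /* Multiply and truncate all four channels in the SIMD domain,
    * then spill the integer indices to memory once. */
   __m128 v = _mm_mul_ps(_mm_loadu_ps(scale), _mm_loadu_ps(clamped));
   _mm_storeu_si128((__m128i *)idx, _mm_cvttps_epi32(v));
#else
   /* Scalar fallback with the same truncating behaviour. */
   for (int c = 0; c < 4; c++)
      idx[c] = (int)(scale[c] * clamped[c]);
#endif

   /* Scalar table lookups stored directly; no _mm_set_ps round trip. */
   dst[RCOMP] = rMap[idx[RCOMP]];
   dst[GCOMP] = gMap[idx[GCOMP]];
   dst[BCOMP] = bMap[idx[BCOMP]];
   dst[ACOMP] = aMap[idx[ACOMP]];
}
```

Whether this is actually faster than the `_mm_set_ps` version would need measuring; it mainly avoids moving the looked-up values back into the SIMD domain only to store them.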