On 11/04/2014 01:24 PM, Roland Scheidegger wrote:
> On 04.11.2014 at 13:05, Juha-Pekka Heikkila wrote:
>> +   for(i = 0; i < n; i++) {
>> +      _mesa_clamp_float_rgba(rgba_src[i], temp, min, max);
>> +
>> +      *operand = _mm_mul_ps(multiplier, *operand);
>> +      truncated_integers = _mm_cvttps_epi32(*operand);
>> +      mmove = _mm_set_ps(aMap[map_p[ACOMP]], bMap[map_p[BCOMP]],
>> +                         gMap[map_p[GCOMP]], rMap[map_p[RCOMP]] );
>> +
>> +      _mm_storeu_ps(rgba_dst[i], mmove);
> The sse2 code at the end looks counterproductive to me. Not sure what
> gcc will generate but I'd suspect it involves some simd->int domain
> transition for the table lookups, plus another int->simd transition to
> get the values back into simd domain (alternatively it might use
> stores/load here) just so you can store them again...
> It would probably be better to just store the values directly after the
> table lookups.
> But in any case actually I'm beginning to suspect no one really cares
> about performance anyway for that path (who the hell uses these
> scale/map features?) so whatever works...
Which raises another question... do we have any piglit tests that
actually exercise this path?

> Roland
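
For reference, here is a minimal sketch of what "store the values
directly after the table lookups" could look like in plain C. This is
not the code from the patch: map_rgba_direct, clamp01, the flat
src/dst layout and the per-channel maps/map_size arrays are all
hypothetical names, and scaling by (table size - 1) is my assumption
about what the multiplier in the patch holds.

```c
#include <stddef.h>

/* Hypothetical helper: clamp to [0, 1]; NaN is mapped to 0. */
static float clamp01(float v)
{
    if (!(v > 0.0f))
        return 0.0f;            /* also catches NaN */
    return v > 1.0f ? 1.0f : v;
}

/*
 * Hypothetical sketch, not Mesa's actual function.
 * src and dst hold n RGBA pixels as 4 * n floats; maps[c] is the
 * lookup table for channel c and map_size[c] its entry count.
 */
static void map_rgba_direct(size_t n, const float *src, float *dst,
                            const float *const maps[4],
                            const int map_size[4])
{
    for (size_t i = 0; i < n; i++) {
        for (int c = 0; c < 4; c++) {
            /* Assumed scale factor: table size minus one. */
            float scaled = clamp01(src[4 * i + c]) * (float)(map_size[c] - 1);
            /* Truncate toward zero, like _mm_cvttps_epi32 does. */
            int idx = (int)scaled;
            /* Store the looked-up value straight to the destination,
             * with no round trip through an SSE register. */
            dst[4 * i + c] = maps[c][idx];
        }
    }
}
```

The lookups are scalar loads either way, so writing each result
straight to dst avoids gathering four values into a vector only to
store them again. Whether that is measurably faster would need
checking against what gcc actually generates.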