On 09/09/14 9:52 AM, Pascal Massimino wrote:
> +    mova      m2, m_sum
> +%if mmsize == 16
> +    psrldq    m2, 4
> +    paddd     m_sum, m2
> +    psrldq    m2, 4
> +    paddd     m_sum, m2
> +    psrldq    m2, 4
> +    paddd     m_sum, m2
> +%else
> +    psrlq     m2, 32
> +    paddd     m_sum, m2
> +%endif

The SSE2 version uses three more instructions than necessary here. You could replace the code above with the HADDD macro, which expands to a more optimized sequence for SSE2.

And now that I check the old code again, you could also use it in the IDET_FILTER_LINE macro. That would save one instruction in the mmxext version.
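
To illustrate where the saving comes from, here is a rough scalar model in C of the two SSE2 reductions. It is only a sketch: the vec4 type and the helper names are made up for this example, and the "fold" sequence (movhlps, paddd, pshuflw with q0032, paddd) is my reading of what HADDD expands to for plain SSE2, so please check it against x86util.asm. The quoted code walks the lanes one at a time (seven instructions), while the fold halves the register twice (four instructions).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one 128-bit register as four 32-bit lanes. */
typedef struct { uint32_t d[4]; } vec4;

/* psrldq r, 4: shift the whole register right by one 32-bit lane. */
static vec4 psrldq4(vec4 r) {
    vec4 o = {{ r.d[1], r.d[2], r.d[3], 0 }};
    return o;
}

/* paddd a, b: lane-wise 32-bit add, wrapping modulo 2^32. */
static vec4 paddd(vec4 a, vec4 b) {
    vec4 o;
    for (int i = 0; i < 4; i++)
        o.d[i] = a.d[i] + b.d[i];
    return o;
}

/* movhlps dst, src: upper 64 bits of src go to the lower 64 bits of dst. */
static vec4 movhlps(vec4 dst, vec4 src) {
    dst.d[0] = src.d[2];
    dst.d[1] = src.d[3];
    return dst;
}

/* pshuflw dst, src, q0032: dword 1 of src lands in dword 0 of dst,
 * dword 1 of dst gets word 0 of src twice, upper 64 bits are copied. */
static vec4 pshuflw_q0032(vec4 src) {
    vec4 o = {{ src.d[1], (src.d[0] & 0xffffu) * 0x10001u,
                src.d[2], src.d[3] }};
    return o;
}

/* The sequence from the quoted patch: 1 mova + 3 x (psrldq, paddd). */
static uint32_t hsum_shift_chain(vec4 sum) {
    vec4 t = sum;                          /* mova   m2, m_sum */
    t = psrldq4(t); sum = paddd(sum, t);   /* lane 0 += s1 */
    t = psrldq4(t); sum = paddd(sum, t);   /* lane 0 += s2 */
    t = psrldq4(t); sum = paddd(sum, t);   /* lane 0 += s3 */
    return sum.d[0];
}

/* The assumed HADDD-style fold: 2 x (shuffle, paddd). */
static uint32_t hsum_fold(vec4 sum) {
    vec4 t = {{ 0, 0, 0, 0 }};             /* junk register */
    t = movhlps(t, sum);                   /* t = { s2, s3, .. } */
    sum = paddd(sum, t);                   /* lanes 0,1 = s0+s2, s1+s3 */
    t = pshuflw_q0032(sum);                /* lane 0 of t = s1+s3 */
    sum = paddd(sum, t);                   /* lane 0 = s0+s1+s2+s3 */
    return sum.d[0];
}

/* Aborts if the two reductions disagree on the given input. */
static void hsum_check(vec4 v) {
    assert(hsum_shift_chain(v) == hsum_fold(v));
}
```

In both cases only lane 0 of the result is meaningful, which is all the caller needs when it stores the sum with movd.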