James,

On Tue, Sep 9, 2014 at 10:31 AM, James Almer <jamr...@gmail.com> wrote:
> On 09/09/14 9:52 AM, Pascal Massimino wrote:
> > +    mova      m2, m_sum
> > +%if mmsize == 16
> > +    psrldq    m2, 4
> > +    paddd     m_sum, m2
> > +    psrldq    m2, 4
> > +    paddd     m_sum, m2
> > +    psrldq    m2, 4
> > +    paddd     m_sum, m2
> > +%else
> > +    psrlq     m2, 32
> > +    paddd     m_sum, m2
> > +%endif
>
> The SSE2 version is using three instructions more than necessary here.
> You could use the HADDD macro to replace the code above, which expands
> to a more optimized SSE2 version.
>
> And now that I check the old stuff again, you could also use it in the
> IDET_FILTER_LINE macro. It will be one less instruction for the mmxext
> version.
>

Oh, right! Let me send you a patch for that...

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
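
For reference, the two reduction strategies discussed in the review can be sketched in C with SSE2 intrinsics. This is only an illustration under stated assumptions: it requires an x86 target with SSE2, the function names hsum_shift_chain and hsum_fold are hypothetical, and the second function shows the general fold-in-halves idea rather than the exact expansion of the HADDD macro.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Shift-and-add chain, mirroring the quoted mmsize == 16 branch:
 * one copy, then three byte shifts and three adds. */
static uint32_t hsum_shift_chain(__m128i sum)
{
    __m128i t = sum;                  /* mova   m2, m_sum */
    t   = _mm_srli_si128(t, 4);       /* psrldq m2, 4     */
    sum = _mm_add_epi32(sum, t);      /* paddd  m_sum, m2 */
    t   = _mm_srli_si128(t, 4);
    sum = _mm_add_epi32(sum, t);
    t   = _mm_srli_si128(t, 4);
    sum = _mm_add_epi32(sum, t);
    return (uint32_t)_mm_cvtsi128_si32(sum);  /* total is in lane 0 */
}

/* Fold-in-halves alternative (hypothetical sketch, not HADDD itself):
 * add the upper 64 bits onto the lower 64 bits, then add lane 1 onto
 * lane 0. Two shuffles and two adds instead of a copy plus six ops. */
static uint32_t hsum_fold(__m128i sum)
{
    /* Bring lanes 2 and 3 down next to lanes 0 and 1. */
    __m128i hi = _mm_unpackhi_epi64(sum, sum);
    sum = _mm_add_epi32(sum, hi);     /* lane0 = a+c, lane1 = b+d */

    /* Swap the two low dwords so lane 1 lands in lane 0. */
    __m128i odd = _mm_shufflelo_epi16(sum, _MM_SHUFFLE(1, 0, 3, 2));
    sum = _mm_add_epi32(sum, odd);    /* lane0 = a+b+c+d */
    return (uint32_t)_mm_cvtsi128_si32(sum);
}
```

Both functions leave the sum of the four 32-bit lanes in lane 0, so either one can feed the final scalar store. The upper lanes hold partial sums and should be treated as junk.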