On 14/01/15 1:59 PM, Michael Niedermayer wrote:
> On Wed, Jan 14, 2015 at 01:53:48AM -0300, James Almer wrote:
>> unpack_2ch is already using sse float ops only, and pack_2ch is a trivial
>> change.
>> Rename both to float_to_float for consistency.
>>
>> Signed-off-by: James Almer <jamr...@gmail.com>
>> ---
>>  libswresample/x86/audio_convert.asm    | 14 ++++++++------
>>  libswresample/x86/audio_convert_init.c | 11 +++++++----
>>  2 files changed, 15 insertions(+), 10 deletions(-)
>>
>> diff --git a/libswresample/x86/audio_convert.asm
>> b/libswresample/x86/audio_convert.asm
>> index 1617e0b..c13c26f 100644
>> --- a/libswresample/x86/audio_convert.asm
>> +++ b/libswresample/x86/audio_convert.asm
>> @@ -60,8 +60,8 @@ pack_2ch_%2_to_%1_u_int %+ SUFFIX
>>      punpcklwd m0, m2
>>      punpckhwd m1, m2
>>  %else
>> -    punpckldq m0, m2
>> -    punpckhdq m1, m2
>> +    unpcklps m0, m2
>> +    unpckhps m1, m2
>>  %endif
>>      %6 m0,m1,m2,m3,m4,m5
>>  %else
>
> did you benchmark this ?
> ive just checked and on Pentium M, Core Solo and Core Duo these are
> listed as having only 1/5 the throughput
> on sandybridge they are still listed with half the throughput than
> their integer counterparts
> i didnt benchmark it though

No, I didn't benchmark it. And you're right: even on recent CPUs they seem to
have half the throughput of their integer counterparts.
Do you think it will mean a considerable performance hit? These functions
aren't even that important in audio processing anyway (perf shows they account
for less than 1% of total CPU time when doing pcm -> pcm).

Nonetheless, considering this, maybe the other functions should be changed to
not use SBUTTERFLYPS.
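
For reference, a minimal scalar C sketch of why the swap is only a throughput
question and not a correctness one: on 128-bit registers, punpckldq/punpckhdq
and unpcklps/unpckhps perform the same interleave of 32-bit lanes, so the
packed output is bit-identical either way. The vec128 type and the
unpack_lo32, unpack_hi32 and pack_2ch_model names are made up for illustration
and are not part of libswresample, and the model says nothing about speed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar stand-in for one 128-bit register seen as four 32-bit lanes. */
typedef struct { uint32_t lane[4]; } vec128;

/* Low-half interleave: the lane shuffle shared by punpckldq and unpcklps. */
static vec128 unpack_lo32(vec128 a, vec128 b)
{
    vec128 r = {{ a.lane[0], b.lane[0], a.lane[1], b.lane[1] }};
    return r;
}

/* High-half interleave: the lane shuffle shared by punpckhdq and unpckhps. */
static vec128 unpack_hi32(vec128 a, vec128 b)
{
    vec128 r = {{ a.lane[2], b.lane[2], a.lane[3], b.lane[3] }};
    return r;
}

/*
 * Hypothetical model of one pack_2ch iteration for 32-bit samples:
 * four samples from each planar channel become eight interleaved samples.
 * The lanes are only moved, never interpreted, so float and int32 data
 * are handled identically.
 */
static void pack_2ch_model(float *dst, const float *ch0, const float *ch1)
{
    vec128 a, b, lo, hi;

    assert(dst && ch0 && ch1);

    memcpy(a.lane, ch0, sizeof(a.lane));
    memcpy(b.lane, ch1, sizeof(b.lane));

    lo = unpack_lo32(a, b);
    hi = unpack_hi32(a, b);

    memcpy(dst,     lo.lane, sizeof(lo.lane));
    memcpy(dst + 4, hi.lane, sizeof(hi.lane));
}
```

Calling pack_2ch_model() with ch0 = {L0, L1, L2, L3} and ch1 = {R0, R1, R2, R3}
fills dst with L0 R0 L1 R1 L2 R2 L3 R3, whichever instruction family the real
asm uses.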