Re: [FFmpeg-devel] [PATCH 3/5] avutil/pixelutils: faster pixelutils_sad_[au]_16x16

2014-08-23 Thread Clément Bœsch
On Sun, Aug 17, 2014 at 01:51:13PM +0200, Michael Niedermayer wrote:
> On Thu, Aug 14, 2014 at 11:05:13PM +0200, Clément Bœsch wrote:
> > ~560 → ~500 decicycles
> >
> > This is following the comments from Michael in
> > https://ffmpeg.org/pipermail/ffmpeg-devel/2014-August/160599.html
> >
> > Usi

Re: [FFmpeg-devel] [PATCH 3/5] avutil/pixelutils: faster pixelutils_sad_[au]_16x16

2014-08-17 Thread Michael Niedermayer
On Thu, Aug 14, 2014 at 11:05:13PM +0200, Clément Bœsch wrote:
> ~560 → ~500 decicycles
>
> This is following the comments from Michael in
> https://ffmpeg.org/pipermail/ffmpeg-devel/2014-August/160599.html
>
> Using 2 registers for accumulator didn't help. On the other hand,
> some re-ordering b

[FFmpeg-devel] [PATCH 3/5] avutil/pixelutils: faster pixelutils_sad_[au]_16x16

2014-08-14 Thread Clément Bœsch
~560 → ~500 decicycles

This is following the comments from Michael in
https://ffmpeg.org/pipermail/ffmpeg-devel/2014-August/160599.html

Using 2 registers for accumulator didn't help. On the other hand,
some re-ordering between the movs and psadbw allowed going ~538 to ~500.
---
 libavutil/x86/pix
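
For readers without the patch body at hand, here is a rough C sketch of what a 16x16 SAD (sum of absolute differences) computes, plus an SSE2 intrinsics variant that issues its loads ahead of the two psadbw operations. This is not the code from the patch, which is x86 assembly and is not reproduced in this excerpt. The function names sad_16x16_ref and sad_16x16_sse2_sketch are hypothetical, the argument layout (two pointers, two strides) is an assumption, and a C compiler is free to reschedule the intrinsics, so the ordering shown only illustrates the idea.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef __SSE2__
#include <emmintrin.h>
#endif

/* Hypothetical scalar reference: sum of |src1 - src2| over a 16x16 block.
 * The (pointer, stride, pointer, stride) layout is an assumption. */
static int sad_16x16_ref(const uint8_t *src1, ptrdiff_t stride1,
                         const uint8_t *src2, ptrdiff_t stride2)
{
    int sum = 0;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++)
            sum += abs(src1[x] - src2[x]);
        src1 += stride1;
        src2 += stride2;
    }
    return sum;
}

#ifdef __SSE2__
/* Hypothetical SSE2 sketch: two rows per iteration, a single accumulator,
 * all four loads written before the two psadbw (_mm_sad_epu8) operations.
 * Unaligned loads are used for both sources, so no alignment is required. */
static int sad_16x16_sse2_sketch(const uint8_t *src1, ptrdiff_t stride1,
                                 const uint8_t *src2, ptrdiff_t stride2)
{
    __m128i acc = _mm_setzero_si128();
    for (int y = 0; y < 16; y += 2) {
        __m128i a0 = _mm_loadu_si128((const __m128i *)src1);
        __m128i a1 = _mm_loadu_si128((const __m128i *)(src1 + stride1));
        __m128i b0 = _mm_loadu_si128((const __m128i *)src2);
        __m128i b1 = _mm_loadu_si128((const __m128i *)(src2 + stride2));
        /* psadbw leaves one partial sum in each 64-bit half of the result */
        acc = _mm_add_epi64(acc, _mm_sad_epu8(a0, b0));
        acc = _mm_add_epi64(acc, _mm_sad_epu8(a1, b1));
        src1 += 2 * stride1;
        src2 += 2 * stride2;
    }
    /* add the low and high partial sums; the maximum is 255 * 256 = 65280 */
    return _mm_cvtsi128_si32(acc) +
           _mm_cvtsi128_si32(_mm_srli_si128(acc, 8));
}
#endif
```

The scalar version is useful as a correctness check for any faster variant: both should return the same value for the same pair of blocks.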