Re: [FFmpeg-devel] [PATCH] vf_psnr: sse2 optimizations for sum-squared-error.

2015-07-14 Thread Michael Niedermayer
On Mon, Jul 13, 2015 at 05:14:09PM -0300, James Almer wrote:
> On 13/07/15 5:07 PM, Ronald S. Bultje wrote:
> > Hi,
> >
> > On Mon, Jul 13, 2015 at 3:50 PM, James Darnley
> > wrote:
> >
> >> On 2015-07-13 01:34, Ronald S. Bultje wrote:
> >>> Hi,
> >>>
> >>> On Sun, Jul 12, 2015 at 5:54 PM, Paul

Re: [FFmpeg-devel] [PATCH] vf_psnr: sse2 optimizations for sum-squared-error.

2015-07-13 Thread James Almer
On 13/07/15 5:07 PM, Ronald S. Bultje wrote:
> Hi,
>
> On Mon, Jul 13, 2015 at 3:50 PM, James Darnley
> wrote:
>
>> On 2015-07-13 01:34, Ronald S. Bultje wrote:
>>> Hi,
>>>
>>> On Sun, Jul 12, 2015 at 5:54 PM, Paul B Mahol wrote:
>>> On 7/12/15, Ronald S. Bultje wrote:
> +typedef stru

Re: [FFmpeg-devel] [PATCH] vf_psnr: sse2 optimizations for sum-squared-error.

2015-07-13 Thread Ronald S. Bultje
Hi,

On Mon, Jul 13, 2015 at 3:50 PM, James Darnley wrote:
> On 2015-07-13 01:34, Ronald S. Bultje wrote:
> > Hi,
> >
> > On Sun, Jul 12, 2015 at 5:54 PM, Paul B Mahol wrote:
> >
> >> On 7/12/15, Ronald S. Bultje wrote:
> >>> +typedef struct PSNRDSPContext {
> >>> +uint64_t (*sse_line)(cons

Re: [FFmpeg-devel] [PATCH] vf_psnr: sse2 optimizations for sum-squared-error.

2015-07-13 Thread James Darnley
On 2015-07-13 01:34, Ronald S. Bultje wrote:
> Hi,
>
> On Sun, Jul 12, 2015 at 5:54 PM, Paul B Mahol wrote:
>
>> On 7/12/15, Ronald S. Bultje wrote:
>>> +typedef struct PSNRDSPContext {
>>> +uint64_t (*sse_line)(const uint8_t *buf, const uint8_t *ref, int w);
>>
>> Besides naming of functio

Re: [FFmpeg-devel] [PATCH] vf_psnr: sse2 optimizations for sum-squared-error.

2015-07-12 Thread Ronald S. Bultje
Hi,

On Sun, Jul 12, 2015 at 5:54 PM, Paul B Mahol wrote:
> On 7/12/15, Ronald S. Bultje wrote:
> > The internal line accumulator for 16bit can overflow, so I changed that
> > from int to uint64_t in the C code. The matching assembly looks a little
> > weird but output looks correct.
> >
> > (av

Re: [FFmpeg-devel] [PATCH] vf_psnr: sse2 optimizations for sum-squared-error.

2015-07-12 Thread Paul B Mahol
On 7/12/15, Ronald S. Bultje wrote:
> The internal line accumulator for 16bit can overflow, so I changed that
> from int to uint64_t in the C code. The matching assembly looks a little
> weird but output looks correct.
>
> (avx2 should be trivial to add later.)
> ---
> libavfilter/psnr.h

[FFmpeg-devel] [PATCH] vf_psnr: sse2 optimizations for sum-squared-error.

2015-07-12 Thread Ronald S. Bultje
The internal line accumulator for 16bit can overflow, so I changed that from int to uint64_t in the C code. The matching assembly looks a little weird but output looks correct.

(avx2 should be trivial to add later.)
---
 libavfilter/psnr.h | 33 ++
 libavfilter/vf_psnr.c

Re: [FFmpeg-devel] [PATCH] vf_psnr: sse2 optimizations for sum-squared-error.

2015-07-12 Thread Clément Bœsch
On Sat, Jul 11, 2015 at 10:50:26AM -0400, Ronald S. Bultje wrote:
> The internal line accumulator for 16bit can overflow, so I changed that
> from int to uint64_t in the C code. The matching assembly looks a little
> weird but output looks correct.
> It assumes aligned input pointers, I'm
> not su

[FFmpeg-devel] [PATCH] vf_psnr: sse2 optimizations for sum-squared-error.

2015-07-11 Thread Ronald S. Bultje
The internal line accumulator for 16bit can overflow, so I changed that from int to uint64_t in the C code. The matching assembly looks a little weird but output looks correct. It assumes aligned input pointers, I'm not sure if that's a requirement in lavfi (it should be IMO). (avx2 should be triv
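
To illustrate the overflow concern described in the patch message, here is a minimal C sketch of a per-line sum-squared-error routine. It is not the code from the patch: the function names `sse_line_8bit_sketch` and `sse_line_16bit_sketch` are hypothetical, the 16-bit variant takes `uint16_t` pointers instead of the `const uint8_t *` signature quoted in the thread, and `w` is assumed to be a pixel count. The point it shows is arithmetic: with 16-bit samples a single squared difference can reach 65535 * 65535 = 4294836225, which already exceeds the largest 32-bit signed `int` (2147483647), so a 64-bit accumulator is the safe choice.

```c
#include <stdint.h>

/* Hypothetical sketch, not the patch code: SSE over one line of 8-bit samples.
 * Each squared difference is at most 255 * 255 = 65025. */
uint64_t sse_line_8bit_sketch(const uint8_t *buf, const uint8_t *ref, int w)
{
    uint64_t sum = 0;
    for (int i = 0; i < w; i++) {
        int d = (int)buf[i] - (int)ref[i];  /* range -255..255 */
        sum += (uint64_t)(d * d);           /* d * d fits easily in int */
    }
    return sum;
}

/* Hypothetical sketch, not the patch code: SSE over one line of 16-bit samples.
 * The difference is widened to 64 bits before squaring, because
 * 65535 * 65535 does not fit in a 32-bit signed int. */
uint64_t sse_line_16bit_sketch(const uint16_t *buf, const uint16_t *ref, int w)
{
    uint64_t sum = 0;
    for (int i = 0; i < w; i++) {
        int64_t d = (int64_t)buf[i] - (int64_t)ref[i];  /* range -65535..65535 */
        sum += (uint64_t)(d * d);
    }
    return sum;
}
```

For example, calling `sse_line_16bit_sketch` on two pixels that each differ by the full 16-bit range returns 8589672450, a value that needs more than 32 bits to represent.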
