Re: [FFmpeg-devel] [PATCH 5/6] x86: lossless audio: SSE4 madd 32bits

2016-04-19 Thread Christophe Gisquet
Hi, 2016-04-20 2:01 GMT+02:00 Ronald S. Bultje : > This is typically only an issue if the data came from stack. On win64 as > well as unix64, the 4th argument never comes from stack but is a direct > register argument instead. So no benefit except consistency. I don't mind either way, though. On

Re: [FFmpeg-devel] [PATCH 5/6] x86: lossless audio: SSE4 madd 32bits

2016-04-19 Thread Ronald S. Bultje
Hi,

On Tue, Apr 19, 2016 at 4:42 PM, James Almer wrote:
> On 4/18/2016 6:25 PM, Christophe Gisquet wrote:
> > 2016-04-18 21:18 GMT+02:00 Michael Niedermayer :
> >> > this breaks (only noise)
> >> > [CCCP]_Mega_Weird_Audio_Test.mkv track 23
> > Worthwhile sample.
> >
> > I rewrote the patch to

Re: [FFmpeg-devel] [PATCH 5/6] x86: lossless audio: SSE4 madd 32bits

2016-04-19 Thread James Almer
On 4/18/2016 6:25 PM, Christophe Gisquet wrote:
> 2016-04-18 21:18 GMT+02:00 Michael Niedermayer :
>> > this breaks (only noise)
>> > [CCCP]_Mega_Weird_Audio_Test.mkv track 23
> Worthwhile sample.
>
> I rewrote the patch to reduce code duplication, and I fixed the issue
> (misread a shift).
>
>

Re: [FFmpeg-devel] [PATCH 5/6] x86: lossless audio: SSE4 madd 32bits

2016-04-19 Thread Michael Niedermayer
On Mon, Apr 18, 2016 at 11:25:30PM +0200, Christophe Gisquet wrote:
> 2016-04-18 21:18 GMT+02:00 Michael Niedermayer :
> > this breaks (only noise)
> > [CCCP]_Mega_Weird_Audio_Test.mkv track 23
>
> Worthwhile sample.
>
> I rewrote the patch to reduce code duplication, and I fixed the issue
> (m

Re: [FFmpeg-devel] [PATCH 5/6] x86: lossless audio: SSE4 madd 32bits

2016-04-18 Thread Christophe Gisquet
2016-04-18 21:18 GMT+02:00 Michael Niedermayer :
> this breaks (only noise)
> [CCCP]_Mega_Weird_Audio_Test.mkv track 23

Worthwhile sample.

I rewrote the patch to reduce code duplication, and I fixed the issue
(misread a shift).

--
Christophe

From a0d4a96c032d73bc0e34fec320497aefafba3c28 Mon S

Re: [FFmpeg-devel] [PATCH 5/6] x86: lossless audio: SSE4 madd 32bits

2016-04-18 Thread Michael Niedermayer
On Mon, Apr 18, 2016 at 03:07:30PM +0200, Christophe Gisquet wrote:
> The unique user so far is wmalossless 24bits. The few samples tested show an
> order of 8, so more unrolling or an avx2 version do not make sense.
>
> Timings: 72 -> 49 cycles
> ---
> libavcodec/x86/lossless_audiodsp.asm| 3

[FFmpeg-devel] [PATCH 5/6] x86: lossless audio: SSE4 madd 32bits

2016-04-18 Thread Christophe Gisquet
The unique user so far is wmalossless 24bits. The few samples tested show an
order of 8, so more unrolling or an avx2 version do not make sense.

Timings: 72 -> 49 cycles
---
 libavcodec/x86/lossless_audiodsp.asm    | 38 +
 libavcodec/x86/lossless_audiodsp_init.c |
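For readers following the thread without the patch body at hand, here is a minimal scalar sketch of what a "madd 32bits" routine of this kind computes. It assumes the routine mirrors the shape of FFmpeg's existing scalarproduct_and_madd_int16 (a dot product of v1 and v2 combined with an in-place v1 += mul * v3 update), only with 32-bit coefficients in v2. The name madd32_sketch, the argument types and the wraparound behaviour are assumptions for illustration, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical scalar reference (name and signature are assumptions).
 * Returns the dot product of v1 and v2, using the values v1 held before
 * the update, and adapts v1 in place: v1[i] += mul * v3[i].
 */
static int32_t madd32_sketch(int16_t *v1, const int32_t *v2,
                             const int16_t *v3, int order, int mul)
{
    uint32_t acc = 0; /* unsigned, so overflow wraps instead of being undefined */

    assert(order > 0); /* the sketch assumes at least one coefficient */

    for (int i = 0; i < order; i++) {
        /* accumulate with the old v1[i], modulo 2^32 */
        acc += (uint32_t)v1[i] * (uint32_t)v2[i];
        /* then adapt v1[i]; the sum is truncated back to 16 bits */
        v1[i] = (int16_t)(v1[i] + mul * v3[i]);
    }

    /* Converting values above INT32_MAX is implementation-defined in C;
     * the usual compilers wrap in two's complement. */
    return (int32_t)acc;
}
```

With an order of 8, as reported for the tested samples, the loop body runs eight times. A 128-bit register holds four 32-bit lanes, so that many products already fit in two registers, which is in line with the remark that more unrolling or wider vectors would bring little.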