Re: [FFmpeg-devel] [PATCHv2] lavu/x86/lls: add fma3 optimizations for update_lls

2016-01-14 Thread Ganesh Ajjanagadde
On Thu, Jan 14, 2016 at 7:23 PM, James Almer wrote:
> On 1/14/2016 9:06 PM, Ganesh Ajjanagadde wrote:
>> On Thu, Jan 14, 2016 at 6:54 PM, Ganesh Ajjanagadde wrote:
>>> On Thu, Jan 14, 2016 at 6:42 PM, James Almer wrote:
>>>> On 1/14/2016 7:46 PM, Ganesh Ajjanagadde wrote:
>>>>> This improves acc

Re: [FFmpeg-devel] [PATCHv2] lavu/x86/lls: add fma3 optimizations for update_lls

2016-01-14 Thread James Almer
On 1/14/2016 9:06 PM, Ganesh Ajjanagadde wrote:
> On Thu, Jan 14, 2016 at 6:54 PM, Ganesh Ajjanagadde wrote:
>> On Thu, Jan 14, 2016 at 6:42 PM, James Almer wrote:
>>> On 1/14/2016 7:46 PM, Ganesh Ajjanagadde wrote:
>>>> This improves accuracy (very slightly) and speed for processors having

Re: [FFmpeg-devel] [PATCHv2] lavu/x86/lls: add fma3 optimizations for update_lls

2016-01-14 Thread Ganesh Ajjanagadde
On Thu, Jan 14, 2016 at 6:54 PM, Ganesh Ajjanagadde wrote:
> On Thu, Jan 14, 2016 at 6:42 PM, James Almer wrote:
>> On 1/14/2016 7:46 PM, Ganesh Ajjanagadde wrote:
>>> This improves accuracy (very slightly) and speed for processors having
>>> fma3.
>>>
>>> Sample benchmark (fate flac-16-lpc-chole

Re: [FFmpeg-devel] [PATCHv2] lavu/x86/lls: add fma3 optimizations for update_lls

2016-01-14 Thread Ganesh Ajjanagadde
On Thu, Jan 14, 2016 at 6:42 PM, James Almer wrote:
> On 1/14/2016 7:46 PM, Ganesh Ajjanagadde wrote:
>> This improves accuracy (very slightly) and speed for processors having
>> fma3.
>>
>> Sample benchmark (fate flac-16-lpc-cholesky, Haswell):
>> old:
>> 5993610 decicycles in ff_lpc_calc_coefs,

Re: [FFmpeg-devel] [PATCHv2] lavu/x86/lls: add fma3 optimizations for update_lls

2016-01-14 Thread James Almer
On 1/14/2016 7:46 PM, Ganesh Ajjanagadde wrote:
> This improves accuracy (very slightly) and speed for processors having
> fma3.
>
> Sample benchmark (fate flac-16-lpc-cholesky, Haswell):
> old:
> 5993610 decicycles in ff_lpc_calc_coefs, 64 runs, 0 skips
> 5951528 decicycles in ff_lpc_ca

[FFmpeg-devel] [PATCHv2] lavu/x86/lls: add fma3 optimizations for update_lls

2016-01-14 Thread Ganesh Ajjanagadde
This improves accuracy (very slightly) and speed for processors having fma3.

Sample benchmark (fate flac-16-lpc-cholesky, Haswell):
old:
5993610 decicycles in ff_lpc_calc_coefs, 64 runs, 0 skips
5951528 decicycles in ff_lpc_calc_coefs, 128 runs, 0 skips
new:
5252410 decicycles