Re: [flac-dev] PATCH: asm versions for two _wide() functions

2014-01-07 Thread Erik de Castro Lopo

lvqcl wrote:
> Erik de Castro Lopo wrote:
>
> > I'll do a little more testing on this and the other patches before pushing
> > to git.
>
> According to my tests, the speed increase after the patch that changes
> "call .get_eip0 / pop eax" to "call .mov_eip_to_eax / mov eax, [esp] / ret"
> is negl

Re: [flac-dev] PATCH: asm versions for two _wide() functions

2014-01-07 Thread lvqcl

Erik de Castro Lopo wrote:
> I'll do a little more testing on this and the other patches before pushing
> to git.
According to my tests, the speed increase after the patch that changes "call .get_eip0 / pop eax" to "call .mov_eip_to_eax / mov eax, [esp] / ret" is negligible or absent. OTOH, libFL

Re: [flac-dev] PATCH: asm versions for two _wide() functions

2014-01-07 Thread Erik de Castro Lopo

lvqcl wrote:
> As I wrote earlier, GCC generates slow ia32 code for
> FLAC__lpc_compute_residual_from_qlp_coefficients_wide()
> and FLAC__lpc_restore_signal_wide(). So 24-bit encoding/decoding is slower
> for GCC compile than for MSVS or ICC compile.
>
> I took FLAC__lpc_compute_residual_from_ql

[flac-dev] PATCH: asm versions for two _wide() functions

2014-01-03 Thread lvqcl

As I wrote earlier, GCC generates slow ia32 code for FLAC__lpc_compute_residual_from_qlp_coefficients_wide() and FLAC__lpc_restore_signal_wide(). So 24-bit encoding/decoding is slower for GCC compile than for MSVS or ICC compile. I took FLAC__lpc_compute_residual_from_qlp_coefficients_asm_ia32 a