On Monday, 27 May 2024, 22:15:40 EEST, James Almer wrote:
> On 5/27/2024 4:10 PM, James Almer wrote:
> > On 5/27/2024 1:01 PM, Rémi Denis-Courmont wrote:
> >> ---
> >> Changes since v2:
> >> - Scale the error factor to length since this computes sums.
> >> - Check the last element from results.
> >> - Use fixed vector size for benchmarks.
> >>
> >> ---
> >>  tests/checkasm/lpc.c | 51 +++++++++++++++++++++++++++++++++++++++++---
> >>  1 file changed, 48 insertions(+), 3 deletions(-)
> >
> > checkasm: using random seed 883526087
> > checkasm: bench runs 1024 (1 << 10)
> >
> > SSE2:
> >  - lpc.apply_welch_window_even [OK]
> >  - lpc.apply_welch_window_odd [OK]
> >
> > 8: 666.011902576448 - 665.600444506565 = 0.411458069884
> >
> > autocorr_8_sse2 (lpc.c:88)
> >  - lpc.compute_autocorr [FAILED]
>
> The following fixes it:
>
> diff --git a/libavcodec/x86/lpc_init.c b/libavcodec/x86/lpc_init.c
> index f2fca53799..9f41639feb 100644
> --- a/libavcodec/x86/lpc_init.c
> +++ b/libavcodec/x86/lpc_init.c
> @@ -99,6 +99,15 @@ static void lpc_compute_autocorr_sse2(const double *data, ptrdiff_t len, int lag
>              );
>          }
>      }
> +
> +    if(j==lag){
> +        double sum = 1.0;
> +        for(int i=j-1; i<len; i+=2){
> +            sum += data[i  ] * data[i-j  ]
> +                 + data[i+1] * data[i-j+1];
> +        }
> +        autoc[j] = sum;
> +    }
>  }
>
>  #endif /* HAVE_SSE2_INLINE */
>
> So the SSE2 version is effectively broken, and ideally should be ported
> to nasm as it's fixed.

I also have my doubts about the C version. The `i += 2` looks a bit
suspicious in the tail case.

--
レミ・デニ-クールモン
http://www.remlab.net/
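One possible reading of the `i += 2` concern, sketched under assumptions: the tail loop in the quoted diff reads `data[i+1]` on every iteration, so if `len - (j - 1)` is odd, the last iteration has `i == len - 1` and touches `data[len]`. The helpers below are hypothetical and not part of FFmpeg. They only replay the loop bounds from the quoted diff and report the highest index that would be read. Whether the callers pad the buffer so that such a read is harmless is not established in this thread.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not an FFmpeg function): returns the highest index
 * of data[] that the quoted tail loop
 *
 *     for (i = j - 1; i < len; i += 2)
 *         sum += data[i] * data[i-j] + data[i+1] * data[i-j+1];
 *
 * would read for the given len and j, or -1 if the body never runs.
 */
static ptrdiff_t tail_loop_max_index(ptrdiff_t len, int j)
{
    ptrdiff_t max_index = -1;

    for (ptrdiff_t i = j - 1; i < len; i += 2)
        max_index = i + 1;  /* data[i+1] is the highest index touched */

    return max_index;
}

/* Nonzero if the loop would read data[len] or beyond. */
static int tail_loop_reads_past_end(ptrdiff_t len, int j)
{
    return tail_loop_max_index(len, j) >= len;
}
```

For example, with `len = 16` and `j = 8` the loop visits `i = 7, 9, 11, 13, 15`, so the last iteration reads `data[16]`, one element past the end of a 16-element array. With `len = 15` and `j = 8` the highest index read is 14, which is in bounds.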