On 5/27/2024 4:15 PM, James Almer wrote:
On 5/27/2024 4:10 PM, James Almer wrote:
On 5/27/2024 1:01 PM, Rémi Denis-Courmont wrote:
---
Changes since v2:
- Scale the error factor to length since this computes sums.
- Check the last element from results.
- Use fixed vector size for benchmarks.
---
tests/checkasm/lpc.c | 51 +++++++++++++++++++++++++++++++++++++++++---
1 file changed, 48 insertions(+), 3 deletions(-)
checkasm: using random seed 883526087
checkasm: bench runs 1024 (1 << 10)
SSE2:
- lpc.apply_welch_window_even [OK]
- lpc.apply_welch_window_odd [OK]
8: 666.011902576448 - 665.600444506565 = 0.411458069884
autocorr_8_sse2 (lpc.c:88)
- lpc.compute_autocorr [FAILED]
The following fixes it:
diff --git a/libavcodec/x86/lpc_init.c b/libavcodec/x86/lpc_init.c
index f2fca53799..9f41639feb 100644
--- a/libavcodec/x86/lpc_init.c
+++ b/libavcodec/x86/lpc_init.c
@@ -99,6 +99,15 @@ static void lpc_compute_autocorr_sse2(const double *data, ptrdiff_t len, int lag
);
}
}
+
+ if(j==lag){
+ double sum = 1.0;
+ for(int i=j-1; i<len; i+=2){
+ sum += data[i ] * data[i-j ]
+ + data[i+1] * data[i-j+1];
+ }
+ autoc[j] = sum;
+ }
}
#endif /* HAVE_SSE2_INLINE */
So the SSE2 version is effectively broken, and ideally it should be
ported to nasm as part of fixing it.
Actually, that only fixes setting the last value. There are still
failures in random places using several different seeds.
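For narrowing down the remaining failures, a minimal scalar sketch may help. It is neither the FFmpeg C reference nor the checkasm test. It assumes autoc[j] = 1.0 + sum(data[i] * data[i-j]) for i in [j, len) and for every j in [0, lag]; the 1.0 bias is taken from the patch above. It compares all lag + 1 outputs, including the last one, against a tolerance scaled by len, as the changelog describes. The names ref_autocorr, first_mismatch and eps are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar reference, not FFmpeg's implementation.
 * Assumes autoc[j] = 1.0 + sum_{i=j}^{len-1} data[i] * data[i-j]
 * for j = 0..lag, so autoc must have room for lag + 1 values. */
static void ref_autocorr(const double *data, ptrdiff_t len, int lag,
                         double *autoc)
{
    assert(lag >= 0 && lag < len);
    for (int j = 0; j <= lag; j++) {
        double sum = 1.0; /* same bias as in the patch above */
        for (ptrdiff_t i = j; i < len; i++)
            sum += data[i] * data[i - j];
        autoc[j] = sum;
    }
}

/* Returns the index of the first output that differs by more than
 * eps * len, or -1 if all lag + 1 outputs match. The tolerance grows
 * with len because each output is a sum of up to len products. */
static int first_mismatch(const double *ref, const double *out, int lag,
                          ptrdiff_t len, double eps)
{
    double tol = eps * (double)len;
    for (int j = 0; j <= lag; j++) {
        double d = ref[j] - out[j];
        if (d < 0)
            d = -d;
        if (!(d <= tol)) /* written this way so NaN counts as a mismatch */
            return j;
    }
    return -1;
}
```

Running first_mismatch over the output of the function under test for several seeds and lengths would show whether the mismatches cluster at particular lags (for example only the last one) or really are scattered.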