On Fri, Jun 13, 2025 at 2:08 AM Martin Storsjö <mar...@martin.st> wrote:
>
> On Fri, 13 Jun 2025, Tristan Matthews wrote:
>
> > On Thu, Jun 12, 2025 at 4:14 PM Martin Storsjö <mar...@martin.st> wrote:
> >>
> >> On Thu, 12 Jun 2025, Tristan Matthews wrote:
> >>
> >>> ---
> >>>  tests/checkasm/h264dsp.c | 37 +++++++++++++++++++++++++++++++++++++
> >>>  1 file changed, 37 insertions(+)
> >>>
> >>> diff --git a/tests/checkasm/h264dsp.c b/tests/checkasm/h264dsp.c
> >>> index d1228ed985..5fba31cf69 100644
> >>> --- a/tests/checkasm/h264dsp.c
> >>> +++ b/tests/checkasm/h264dsp.c
> >>> @@ -22,6 +22,7 @@
> >>>  #include "checkasm.h"
> >>>  #include "libavcodec/h264dsp.h"
> >>>  #include "libavcodec/h264data.h"
> >>> +#include "libavcodec/h264idct.h"
> >>>  #include "libavcodec/h264_parse.h"
> >>>  #include "libavutil/common.h"
> >>>  #include "libavutil/intreadwrite.h"
> >>> @@ -324,6 +325,41 @@ static void check_idct_multiple(void)
> >>>      }
> >>>  }
> >>>
> >>> +static void check_idct_dequant(void)
> >>> +{
> >>> +    static const int depths[5] = { 8, 9, 10, 12, 14 };
> >>> +    LOCAL_ALIGNED_16(int16_t, src, [16]);
> >>> +    LOCAL_ALIGNED_16(int16_t, dst0, [16 * 16]);
> >>> +    LOCAL_ALIGNED_16(int16_t, dst1, [16 * 16]);
> >>> +    H264DSPContext h;
> >>> +    int bit_depth, i, qmul;
> >>> +    declare_func_emms(AV_CPU_FLAG_MMX | AV_CPU_FLAG_SSE2, void, int16_t
> >>> *output, int16_t *input, int qmul);
> >>> +
> >>> +    for (int j = 0; j < 16; j++)
> >>> +        src[j] = (rnd() % 512) - 256;
> >>> +
> >>> +    qmul = rnd() % 4096;
> >>> +
> >>> +    memset(dst0, 0, 16 * 16 * sizeof(dst0[0]));
> >>> +    memset(dst1, 0, 16 * 16 * sizeof(dst1[0]));
> >>> +
> >>> +    for (i = 0; i < FF_ARRAY_ELEMS(depths); i++) {
> >>> +        bit_depth = depths[i];
> >>> +        ff_h264dsp_init(&h, bit_depth, 1);
> >>> +
> >>> +        if (check_func(h.h264_luma_dc_dequant_idct,
> >>> "h264_luma_dc_dequant_idct_%d", bit_depth)) {
> >>> +
> >>> +            call_ref(dst0, src, qmul);
> >>> +            call_new(dst1, src, qmul);
> >>> +
> >>> +            if (memcmp(dst0, dst1, 16 * 16 * sizeof(*dst0)))
> >>> +                fail();
> >>
> >> If possible, use the checkasm_check_*() helpers for validation for new
> >> code; this gives you printout of the differing values if you run "checkasm
> >> -v" and more. In this case, I think checkasm_check(int16_t, dst0,
> >> 16*sizeof(int16_t), dst1, 16*sizeof(int16_t), 16, 16, "dst") would be
> >> suitable one.
> >
> > Good catch, also I realized that the output buffers were too small,
> > will be fixed in the next version.
>
> Why was that too small? If we write (and check) 16x16 int16_t elements,
> the previous allocation of LOCAL_ALIGNED_16(int16_t, dst0, [16 * 16])
> sounds just right? Or does the function use the [16*16,2*16*16) area of
> the destination as scratch space?
That's what I thought too, until I noticed the FATE failures (e.g.
https://patchwork.ffmpeg.org/check/124147/). On further digging, I realized
that dctcoef (used for dst here:
https://git.ffmpeg.org/gitweb/ffmpeg.git/blob/fb65ecbc9b805571e5ff707b935c343803137e54:/libavcodec/h264idct_template.c#l256
) is either 2 or 4 bytes wide depending on the bit depth, IIUC (see
https://git.ffmpeg.org/gitweb/ffmpeg.git/blob/fb65ecbc9b805571e5ff707b935c343803137e54:/libavcodec/bit_depth_template.c#l54
). So a buffer of 16 * 16 int16_t elements is too small whenever dctcoef is
4 bytes wide.

Best,
Tristan