Thanks for feedback! On Thu, Nov 14, 2024 at 9:40 PM Ronald S. Bultje <rsbul...@gmail.com> wrote:
> In terms of binary size, the above will triplicate nearly identical code in > the binary, and that doesn't really seem necessary. You should be able to > write the above out as a function and call it with appropriate xblen and > func_index arguments (check_add_obmc(0, 8); check_add_obmc(1, 16); > check_add_obmc(2, 32)) without duplicating these in the binary. Done for many > tests, this reduces binary size and compile time significantly. Sounds good. I'm going to implement these since I have some free time now. > You can also get bonus points if you randomize yblen, since the bitstream > allows multiple values for this (at least 4, 12, 16, 24, but possibly any > number), so maybe 1 + (random % 64) would be a good choice? I'll try this one too. Thanks. Kyosuke On Thu, Nov 14, 2024 at 9:40 PM Ronald S. Bultje <rsbul...@gmail.com> wrote: > > Hi, > > thanks for adding the test! This looks pretty good. Minor suggestion: > > On Wed, Nov 13, 2024 at 5:39 PM Kyosuke Kawakami <kawakami150...@gmail.com> > wrote: >> >> +#define RANDOMIZE_DESTS(name, size) \ >> + do { \ >> + int i; \ >> + for (i = 0; i < size; ++i) { \ >> + uint16_t r = rnd(); \ >> + AV_WN16A(name##0 + i, r); \ >> + AV_WN16A(name##1 + i, r); \ >> + } \ >> + } while (0) >> + >> +#define RANDOMIZE_BUFFER8(name, size) \ >> + do { \ >> + int i; \ >> + for (i = 0; i < size; ++i) { \ >> + uint8_t r = rnd(); \ >> + name[i] = r; \ >> + } \ >> + } while (0) >> + >> +#define OBMC_STRIDE 32 >> + >> +#define CHECK_ADD_OBMC(func_index, yblen, xblen) >> \ >> + static void check_add_obmc ## xblen(void) >> \ >> + { >> \ >> + LOCAL_ALIGNED_8(uint8_t, src, [yblen * xblen]); >> \ >> + LOCAL_ALIGNED_16(uint16_t, dst0, [yblen * xblen]); >> \ >> + LOCAL_ALIGNED_16(uint16_t, dst1, [yblen * xblen]); >> \ >> + LOCAL_ALIGNED_8(uint8_t, obmc_weight, [yblen * OBMC_STRIDE]); >> \ >> + DiracDSPContext h; >> \ >> + ff_diracdsp_init(&h); >> \ >> + if (check_func(h.add_dirac_obmc[func_index], >> "diracdsp.add_dirac_obmc_%d", xblen)) { \ >> + declare_func(void, uint16_t*, const uint8_t*, int, const >> uint8_t *, int); \ >> + RANDOMIZE_BUFFER8(src, yblen * xblen); >> \ >> + RANDOMIZE_DESTS(dst, yblen * xblen); >> \ >> + RANDOMIZE_BUFFER8(obmc_weight, yblen * OBMC_STRIDE); >> \ >> + call_ref(dst0, src, xblen, obmc_weight, yblen); >> \ >> + call_new(dst1, src, xblen, obmc_weight, yblen); >> \ >> + if (memcmp(dst0, dst1, yblen * xblen)) >> \ >> + fail(); >> \ >> + bench_new(dst1, src, xblen, obmc_weight, yblen); >> \ >> + } >> \ >> + } >> \ >> + >> +CHECK_ADD_OBMC(0, 64, 8) >> +CHECK_ADD_OBMC(1, 64, 16) >> +CHECK_ADD_OBMC(2, 64, 32) >> + >> +void checkasm_check_diracdsp(void) >> +{ >> + check_add_obmc8(); >> + check_add_obmc16(); >> + check_add_obmc32(); >> + report("diracdsp"); >> +} > > > In terms of binary size, the above will triplicate nearly identical code in > the binary, and that doesn't really seem necessary. You should be able to > write the above out as a function and call it with appropriate xblen and > func_index arguments (check_add_obmc(0, 8); check_add_obmc(1, 16); > check_add_obmc(2, 32)) without duplicating these in the binary. Done for many > tests, this reduces binary size and compile time significantly. > > You can also get bonus points if you randomize yblen, since the bitstream > allows multiple values for this (at least 4, 12, 16, 24, but possibly any > number), so maybe 1 + (random % 64) would be a good choice? > > If you don't care for the above, we can merge this as-is, they're merely > suggestions, let me know what you think. > > Ronald _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".