Jul 15, 2023, 10:26 by r...@remlab.net:

> On Saturday, 15 July 2023, 11:05:51 EEST, Lynne wrote:
>
>> Jul 14, 2023, 20:29 by r...@remlab.net:
>>
>> > This makes all calls to the bench start and stop functions via
>> > function pointers. While the primary goal is to support run-time
>> > selection of the performance measurement back-end in later commits,
>> > this has the side benefit of containing platform dependencies in to
>> > checkasm.c and out of checkasm.h.
>> > ---
>> >
>> > tests/checkasm/checkasm.c | 33 ++++++++++++++++++++++++++++-----
>> > tests/checkasm/checkasm.h | 31 ++++---------------------------
>> > 2 files changed, 32 insertions(+), 32 deletions(-)
>>
>> Not sure I agree with this commit, the overhead can be detectable,
>> and we have a lot of small functions with runtime a few times that
>> of a null function call.
>>
>
> I don't think the function call is ever null. The pointers are left NULL only
> if none of the backends initialise. But then, checkasm will bail out and exit
> before we try to benchmark anything anyway.
>
> As for the real functions, they always do *something*. None of them "just
> return 0".
>
I meant a no-op function call to measure the overhead of function
calls themselves, complete with all the ABI stuff.

>> Can you store the function pointers out of the loop to reduce
>> the derefs needed?
>>
>
> Taking just the two loads out of the loop should be feasible but it seems
> rather vain. You will still have the overhead of the indirect function call,
> the function, and most importantly in the case of Linux perf and macOS kperf,
> the system calls.
>
> The only way to avoid the indirect function calls is to use IFUNC (tricky and
> not portable), or to make horrible macros to spawn one bench loop for each
> backend.
>
> In the end, I think we should rather aim for as constant time as possible,
> rather than as fast as possible, so that the nop loop can estimate the
> benchmarking overhead as well as possible. In this respect, I think it is
> actually marginally better *not* to cache the function pointers in local
> variables, which could end up spilled on the stack, or not, depending on local
> compiler optimisations for any given test case.
>

I disagree: uninlining the timer fetches adds another source of inconsistency.
It may be messy, but I think accuracy here is more important than cleanliness,
especially as it's a development tool.
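
For reference, the two loop shapes being debated, and the nop-loop overhead estimate, could look roughly like the sketch below. This is not checkasm's code: BenchBackend, fake_start(), fake_stop(), bench(), bench_cached() and bench_net() are hypothetical names, and the clock is a simulated tick counter (an assumption made purely so the numbers are deterministic), not perf, kperf or a hardware timer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical back-end table; the names are illustrative, not checkasm's. */
typedef struct BenchBackend {
    uint64_t (*start)(void);
    uint64_t (*stop)(void);
} BenchBackend;

/* Simulated clock (assumption): keeps the sketch deterministic. */
static uint64_t fake_ticks;

/* In this model every timer read costs a fixed 5 ticks. */
static uint64_t fake_start(void) { fake_ticks += 5; return fake_ticks; }
static uint64_t fake_stop(void)  { fake_ticks += 5; return fake_ticks; }

static const BenchBackend fake_backend = { fake_start, fake_stop };

static void nop_func(void)  { }                  /* the "nop loop" body */
static void work_func(void) { fake_ticks += 12; } /* pretend 12 ticks of work */

/* Variant 1: load the pointers from the table on every iteration. */
static uint64_t bench(const BenchBackend *be, void (*func)(void), int runs)
{
    uint64_t total = 0;

    assert(be && be->start && be->stop && func && runs > 0);
    for (int i = 0; i < runs; i++) {
        uint64_t t0 = be->start();
        func();
        total += be->stop() - t0;
    }
    return total / (uint64_t)runs;
}

/* Variant 2: hoist the two loads out of the loop into locals. */
static uint64_t bench_cached(const BenchBackend *be, void (*func)(void), int runs)
{
    uint64_t total = 0;

    assert(be && be->start && be->stop && func && runs > 0);
    uint64_t (*const start)(void) = be->start;
    uint64_t (*const stop)(void)  = be->stop;

    for (int i = 0; i < runs; i++) {
        uint64_t t0 = start();
        func();
        total += stop() - t0;
    }
    return total / (uint64_t)runs;
}

/* Subtract the overhead measured with the nop loop from the gross figure. */
static uint64_t bench_net(const BenchBackend *be, void (*func)(void), int runs)
{
    uint64_t overhead = bench(be, nop_func, runs);
    uint64_t gross    = bench(be, func, runs);

    return gross > overhead ? gross - overhead : 0;
}
```

In this model bench() and bench_cached() report identical figures, because a simulated clock cannot see the cost of the extra loads or of a stack spill. On real hardware that difference is exactly what is in dispute above. The subtraction in bench_net() only recovers the true cost when the overhead is the same in the nop run and the real run, which is the "constant time" argument.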