Michael Niedermayer:
> On Fri, Sep 16, 2022 at 04:55:39PM +0200, Andreas Rheinhardt wrote:
>> Up until now, libswscale/output.c used a macro to write
>> an output pixel which involved a call to av_pix_fmt_desc_get()
>> to find out whether the input pixel format is BE or LE
>> despite this being known at compile-time (there are templates
>> per pixfmt). Even worse, these calls are made in a loop,
>> so that e.g. there are eight calls to av_pix_fmt_desc_get()
>> for every pixel processed in yuv2rgba64_X_c_template()
>> for 64bit RGB formats.
>>
>> This commit modifies these macros to ensure that isBE()
>> is evaluated at compile-time. This saved 41184B of .text
>> for me (GCC 11.2, -O3). Of course, it also improved performance.
>> E.g. ffmpeg_g -f lavfi -i testsrc2,format=yuva420p -pix_fmt rgba64le \
>> -threads 1 -t 1:00 -f null - (which uses yuv2rgba64le_X_c,
>> which is an invocation of yuv2rgba64_X_c_template() mentioned above),
>> performance improved from 95589 to 41387 decicycles for one call
>> to yuv2packedX; for the be variant the numbers went down from
>> 76087 to 43024 decicycles.
>>
>> Signed-off-by: Andreas Rheinhardt <andreas.rheinha...@outlook.com>
>> ---
>> libswscale/output.c | 100 +++++++++++++++++++++++++-------------------
>> 1 file changed, 58 insertions(+), 42 deletions(-)
>
> This looks a lot better than before
>
> thx
>
> PS: i still think that broader support for compile time evaluation of
> "pure" functions would be useful. Ideally with minimal mess on the source
> side, more on the build tool side
>
I agree with that. Hopefully we find a solution.

- Andreas
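
For readers of the archive, here is a minimal sketch of the idea described in the quoted patch: a per-format "template" function receives the pixel format as an argument that is a compile-time constant at every call site, so the endianness check can be folded away instead of being looked up for every pixel. This is not the code from libswscale/output.c. Every name below (SketchFmt, sketch_is_be, sketch_write16, sketch_store_row_*) is hypothetical and only stands in for the real macros and templates, and whether the branch is actually folded depends on the compiler inlining the helpers.

```c
#include <stdint.h>

/* Hypothetical stand-in for a pixel format enum; not FFmpeg's AVPixelFormat. */
enum SketchFmt { SKETCH_RGBA64LE, SKETCH_RGBA64BE };

/* A "pure" check with no table lookup. When fmt is a compile-time
 * constant and this is inlined, the result is a constant too. */
static inline int sketch_is_be(enum SketchFmt fmt)
{
    return fmt == SKETCH_RGBA64BE;
}

/* Write one 16-bit component in the requested byte order. */
static inline void sketch_write16(uint8_t *dst, unsigned val, int is_be)
{
    if (is_be) {
        dst[0] = (uint8_t)(val >> 8);
        dst[1] = (uint8_t)(val & 0xFF);
    } else {
        dst[0] = (uint8_t)(val & 0xFF);
        dst[1] = (uint8_t)(val >> 8);
    }
}

/* Template: fmt is passed as a literal by each wrapper below, so after
 * inlining the compiler can drop the unused branch in sketch_write16(). */
static inline void sketch_store_row_template(uint8_t *dst, const uint16_t *src,
                                             int n, enum SketchFmt fmt)
{
    const int is_be = sketch_is_be(fmt); /* evaluated once, outside the loop */
    for (int i = 0; i < n; i++)
        sketch_write16(dst + 2 * i, src[i], is_be);
}

/* One instantiation per format, mirroring the per-pixfmt wrappers
 * the patch description mentions. */
static void sketch_store_row_le(uint8_t *dst, const uint16_t *src, int n)
{
    sketch_store_row_template(dst, src, n, SKETCH_RGBA64LE);
}

static void sketch_store_row_be(uint8_t *dst, const uint16_t *src, int n)
{
    sketch_store_row_template(dst, src, n, SKETCH_RGBA64BE);
}
```

Calling sketch_store_row_le() on the component 0x1234 stores the bytes 0x34, 0x12; sketch_store_row_be() stores 0x12, 0x34. Plain `static inline` is only a hint, so a real codebase would likely force inlining with a compiler attribute to make sure the constant propagates.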