Michael Niedermayer:
> On Thu, Sep 08, 2022 at 09:38:51PM +0200, Andreas Rheinhardt wrote:
>> Michael Niedermayer:
>>> Hi
>>>
>>> On Thu, Sep 08, 2022 at 04:38:11AM +0200, Andreas Rheinhardt wrote:
>>>> Up until now, libswscale/input.c used a macro to read
>>>> an input pixel which involved a call to av_pix_fmt_desc_get()
>>>> to find out whether the input pixel format is BE or LE
>>>> despite this being known at compile-time (there are templates
>>>> per pixfmt). Even worse, these calls are made in a loop,
>>>> so that e.g. there are six calls to av_pix_fmt_desc_get()
>>>> for every pair of UV pixels processed in
>>>> rgb64ToUV_half_c_template().
>>>>
>>>> This commit modifies these macros to ensure that isBE()
>>>> is evaluated at compile-time. This saved 9743B of .text
>>>> for me (GCC 11.2, -O3).
>>>
>>> hmm, all these functions were supposed to be optimized out
>>> why were they not?
>>>
>>> I am asking as the code is simpler before your patch if that
>>> "optimization out" thing would work
>>>
>>
>> Why should these functions be optimized out? What would enable the
>> compiler to optimize them out?
> 
> Going back into the past, there was
> 6b0768e2021b90215a2ab55ed427bce91d148148
> 
> before this the code certainly did get optimized out, it was just
> #define isBE(x) ((x)&1)
> 
> that's simple and clean code btw

I don't really consider such magic numbers to be clean.

> after this it became
> 
> #define isBE(x) \
> +    (av_pix_fmt_descriptors[x].flags & PIX_FMT_BE)
> 
> that's still really good, and very readable, it's a const array so
> one would assume that a compiler can figure that out at compile time
> well, I try not to think of linking and separate objects here ;)
> 
> next it got then replaced by a function and a call that i suspect
> people thought would be inlined
> 
> 
>> (And I really don't see why this patch would make the code more
>> complicated.)
> 
> the code historically was capable of looking up any flag and detail
> of a pixel format at compile time
> now your code works around that not working, introducing a 2nd
> system to do this in parallel.

I am not introducing a second system, I am reusing the existing system,
namely our existing naming system (the fact that we use BE/LE in the
name of BE/LE pixel formats).
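
To illustrate the general idea of deriving endianness from the name at
compile time, here is a minimal standalone sketch. It is not the code from
the patch: every identifier starting with demo_ or DEMO_ is hypothetical,
and the real templates in libswscale/input.c look different. The point is
only that a BE/LE token passed to a template macro expands to a literal 0
or 1, so the compiler can drop the unused branch without any lookup.

```c
#include <stdint.h>

/* Hypothetical byte readers, standing in for the real read macros. */
static inline uint16_t demo_rd16be(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static inline uint16_t demo_rd16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* The BE/LE suffix of the format name is pasted into a macro name that
 * expands to a literal constant, so no descriptor lookup is needed. */
#define DEMO_IS_BE_BE 1
#define DEMO_IS_BE_LE 0
#define DEMO_IS_BE(suffix) DEMO_IS_BE_ ## suffix

/* Template: one reader per pixel format; the condition is a constant. */
#define DEMO_DEFINE_READER(name, suffix)                           \
static inline uint16_t demo_read_ ## name(const uint8_t *p)        \
{                                                                  \
    return DEMO_IS_BE(suffix) ? demo_rd16be(p) : demo_rd16le(p);   \
}

DEMO_DEFINE_READER(rgb48be, BE)
DEMO_DEFINE_READER(rgb48le, LE)

/* Example input: the same two bytes read with both byte orders. */
static const uint8_t demo_bytes[2] = { 0x12, 0x34 };
```

With demo_bytes as input, demo_read_rgb48be() yields 0x1234 and
demo_read_rgb48le() yields 0x3412.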

> To me, if I look at the evolution
> of isBE() / code checking BE-ness, it became more messy over time
> 
> I think it would be interesting to think about whether we can make
> av_pix_fmt_desc_get(compile time constant) work at compile time,
> or if we maybe can return to a simpler implementation
> 

We could put the av_pix_fmt_descriptors array into an internal header
and use something like

static av_always_inline const AVPixFmtDescriptor
*ff_pix_fmt_descriptor_get(enum AVPixelFormat fmt)
{
    if (av_builtin_constant_p(fmt))
        return &av_pix_fmt_descriptors[fmt];
    return av_pix_fmt_desc_get(fmt);
}
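
A self-contained sketch of the same pattern, for anyone who wants to try
it outside the tree. All demo_/DEMO_ names are hypothetical stand-ins for
the FFmpeg types and functions, and the fallback function is kept in the
same file only so that the example compiles on its own; in the real case
it would be the out-of-line av_pix_fmt_desc_get(). The sketch assumes GCC
or Clang for __builtin_constant_p and falls back to the function call
elsewhere.

```c
#include <stddef.h>

/* Hypothetical stand-ins for AVPixelFormat and AVPixFmtDescriptor. */
enum demo_pix_fmt { DEMO_FMT_RGB48BE, DEMO_FMT_RGB48LE, DEMO_FMT_NB };

#define DEMO_FLAG_BE 1

typedef struct DemoDescriptor {
    const char *name;
    unsigned    flags;
} DemoDescriptor;

/* The table lives in the (internal) header, so its contents are visible
 * to the compiler in every translation unit that includes it. */
static const DemoDescriptor demo_descriptors[DEMO_FMT_NB] = {
    [DEMO_FMT_RGB48BE] = { "rgb48be", DEMO_FLAG_BE },
    [DEMO_FMT_RGB48LE] = { "rgb48le", 0 },
};

/* Stand-in for the range-checked, out-of-line lookup function. */
static const DemoDescriptor *demo_desc_get(enum demo_pix_fmt fmt)
{
    if ((unsigned)fmt >= DEMO_FMT_NB)
        return NULL;
    return &demo_descriptors[fmt];
}

#if defined(__GNUC__) || defined(__clang__)
#define demo_constant_p(x) __builtin_constant_p(x)
#else
#define demo_constant_p(x) 0
#endif

/* Direct table access when fmt is a compile-time constant (assumed to be
 * a valid format in that case), function call otherwise. */
static inline const DemoDescriptor *demo_desc_get_inline(enum demo_pix_fmt fmt)
{
    if (demo_constant_p(fmt))
        return &demo_descriptors[fmt];
    return demo_desc_get(fmt);
}

/* Returns 1 for big-endian formats, 0 otherwise. */
static inline int demo_is_be(enum demo_pix_fmt fmt)
{
    const DemoDescriptor *desc = demo_desc_get_inline(fmt);
    return desc && (desc->flags & DEMO_FLAG_BE);
}
```

Both paths return the same descriptor, so the result does not depend on
whether the compiler manages to prove fmt constant; only the generated
code differs.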

- Andreas