Re: [FFmpeg-devel] [PATCH V1] lavfi/normalize: improve the performance

Reimar Döffinger Fri, 21 Jun 2019 01:39:40 -0700

On Thu, Jun 20, 2019 at 09:43:46PM +0800, Jun Zhao wrote:
> From: Jun Zhao <[email protected]>
>
> Remove unnecessary max value found, it's will improve the performance
> about 10%. Used the test command like:
> ffmpeg -i 1080P.mp4 -an -vf normalize -f null /dev/null, the FPS change
> from 96fps to 107fps.
>
> Signed-off-by: Jun Zhao <[email protected]>
> ---
>  libavfilter/vf_normalize.c |    7 +++----
>  1 files changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/libavfilter/vf_normalize.c b/libavfilter/vf_normalize.c
> index 48eea59..40dc031 100644
> --- a/libavfilter/vf_normalize.c
> +++ b/libavfilter/vf_normalize.c
> @@ -143,14 +143,13 @@ static void normalize(NormalizeContext *s, AVFrame *in, 
> AVFrame *out)
>          min[c].in = max[c].in = in->data[0][s->co[c]];
>      for (y = 0; y < in->height; y++) {
>          uint8_t *inp = in->data[0] + y * in->linesize[0];
> -        uint8_t *outp = out->data[0] + y * out->linesize[0];
>          for (x = 0; x < in->width; x++) {
>              for (c = 0; c < 3; c++) {
> -                min[c].in = FFMIN(min[c].in, inp[s->co[c]]);
> -                max[c].in = FFMAX(max[c].in, inp[s->co[c]]);
> +                uint8_t val = inp[s->co[c]];
> +                if (val < min[c].in)      min[c].in = val;
> +                else if (val > max[c].in) max[c].in = val;
>              }
>              inp += s->step;
> -            outp += s->step;


outp should just be removed, a patch for that certainly would be
accepted.
But I am very skeptical towards the rest of the change.
In a way I would expect the compiler should be able to
optimize this.
However this code is clearly completely unoptimized, and
thus I don't think making it more complex for a minor optimization
that gives just 10% is worth it.
A proper optimization would get rid of the s->co indirection in the
inner loop - it really is not necessary for the limited format
support the code has.
This would then also allow for SIMD optimization, which should give
a vastly larger speedup - and possibly the compiler would even be able
to auto-vectorize.
This applies even more to the processing step - the transform
calculation done in SIMD is almost certain to be cheaper than
a LUT lookup - at least if switch to fixed-point arithmetic
instead of float.
Side note: I haven't really figured out what the point
of all this s->co complexity is supposed to be.
The only thing the code seems to actually care is whether
there is an alpha channel, and if so where it is.
But the quickest path to a significant speedup would
be to write a SIMD version of the normalize function specific
to that format and the SIMD instruction set available
and call it instead of normalize if applicable.
I would expect that you should see at least a 3x speedup
if using e.g. SSE2 (which is why I don't consider a
10% speedup worth changing the code for really).

Best regards,
Reimar Döffinger
_______________________________________________
ffmpeg-devel mailing list
[email protected]
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
[email protected] with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH V1] lavfi/normalize: improve the performance

Reply via email to