yuv420p10 input pixel format support

Michael Niedermayer Sun, 22 Sep 2019 10:00:16 -0700

On Sat, Sep 21, 2019 at 10:49:21PM -0400, Devin Heitmueller wrote:
> 
> > On Sep 21, 2019, at 4:44 PM, Michael Niedermayer <mich...@niedermayer.cc> 
> > wrote:
> > 
> >> The patch just expands 4:2:0 to 4:2:2 while properly supporting interlaced 
> >> chroma.  
> > 
> > 4:2:0 and 4:2:2 have a chroma plane with different resolution.
> > Converting between planes of different resolution is what I called scaling.
> > 
> > 
> >> It avoids having to auto insert the swscale filter in the case where there 
> >> is no scaling required (e.g. H.264 4:2:0 video being output to decklink in 
> >> its original resolution).
> > 
> > Yes, doing an operation in the encoder avoids a filter being inserted which
> > does that operation.
> > That's true for every encoder and every filter.
> 
> The key thing here is that the encoder is already touching every pixel, so 
> avoiding the need for the filter allows the conversion to happen at 
> essentially zero cost (as we repack the pixels into the requisite 
> v210 layout).
> 
> > Also, replacing interpolation with a nearest-neighbor implementation
> > is, as expected, faster.
> 
> Yes, and we can certainly argue about whether doing interpolation of chroma 
> when doing 4:2:0 to 4:2:2 actually has any visible benefit.  I can however 
> say the cost of having swscaler in the pipeline is considerable.  In fact I 
> didn’t appreciate it myself until I was trying to deliver 1080p60 in realtime 
> to four decklink outputs and couldn’t keep up on my target platform.  And 
> because filters generally aren’t threaded, I got hit with one of those cases 
> where I had to break out the profiler and ask “why on Earth is the main 
> ffmpeg thread so busy?”
> 
> 
> > One problem is that
> > the user can set up the scale filter with high quality in mind or with 
> > low quality and speed in mind.
> > But after this patch she always gets low quality, because the low-quality
> > conversion code is hardcoded in the encoder, which pretends to support 420.
> > The outside code has no chance to know it shouldn't feed 420 if high quality
> > is wanted.
>


> The user can still insert a scaler explicitly or use the pix_fmt argument so 
> the format filter gets put into the pipeline.

The problem is that the user first has to know about the "in encoder convert"
and fully understand the implications. Only then can she insert a scaler and
format filter manually (unless she is copying and pasting this from somewhere).
That is a lot of knowledge for the average user.


> 
> > 
> > Also why should this be in one encoder and not be available to other
> > encoders supporting 4:2:2 input ?
> > A solution should work for all of them
> 
> I would assume this would really only be helpful in encoders which only 
> support 4:2:2 and not 4:2:0, since typical encoders that accept 4:2:0 would 
> preserve that in their resulting encoding (i.e. they wouldn’t blindly upscale 
> 4:2:0 to 4:2:2 for no good reason).

Maybe. But then there are specifications that demand 4:2:2 in cases where the
underlying encoder supports 4:2:0. In these cases 4:2:2 would need to be
delivered, and delivering 4:2:0 directly to the encoder would not produce
a compliant result (with current encoders).


> 
> I did actually consider doing a separate filter which just does packed/planar 
> conversion and 4:2:0 to 4:2:2 (as opposed to swscaler).  In this case though 
> the additional modularity in such a filter was outweighed by my goal to 
> minimize the number of times I’m copying the frame data.  Combining it with 
> the v210 encoding meant only a single pass over the data.
> 
> > 
> > I am not sure what the best solution is, but simply hardcoding this in
> > one encoder feels rather wrong.
> 
> The scale filter performs three basic roles:
> 1.  Scaling
> 2.  Packed to planar conversion (or vice versa)
> 3.  Colorspace conversion
> 

> I suppose someone could potentially redesign swscale to include the option 
> to not take the slow path for cases where scaling isn’t actually required 
> (i.e. cases where only 2 and 3 are needed).

For this, look at ff_get_unscaled_swscale(). A new 420->422 scaler can easily
be added in there; it would not be zero-copy, though. But it certainly would
make sense for the cases where this codepath is used.
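
To illustrate the kind of copying 420->422 chroma expansion meant here, a
minimal sketch follows. It is a standalone helper with a made-up name and
signature, not the form libswscale's unscaled converters actually take. It
assumes 4:2:0 chroma rows alternate between the two fields when the content
is interlaced, and it only repeats rows (nearest neighbor), with no
interpolation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of libswscale: expand one 4:2:0 chroma
 * plane to 4:2:2 height by repeating source rows. One memcpy per output
 * row, so this is a copying conversion, not a zero-copy one. */
static void chroma_420_to_422_copy(uint8_t *dst, ptrdiff_t dst_stride,
                                   const uint8_t *src, ptrdiff_t src_stride,
                                   size_t row_bytes, int dst_height,
                                   int src_height, int interlaced)
{
    for (int y = 0; y < dst_height; y++) {
        /* Progressive: output rows 2k and 2k+1 both read source row k.
         * Interlaced (assumed layout): even output rows read only even
         * (top field) source rows, odd output rows only odd ones. */
        int src_y = interlaced ? ((y >> 2) << 1) + (y & 1) : y >> 1;

        /* Clamp for heights that are not a multiple of 2 (or 4). */
        if (src_y > src_height - 1)
            src_y = src_height - 1;

        memcpy(dst + y * dst_stride, src + src_y * src_stride, row_bytes);
    }
}
```

It would be called once per chroma plane (U and V), with row_bytes being
the chroma width times the bytes per sample; the luma plane is left alone.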


> 
> Just so we’re all on the same page - this wasn’t a case of random or 
> premature optimization.  I have a specific use case where I’m decoding four 
> instances of 1080p60 video and the platform can’t keep up without this 
> change.  It’s the result of actually profiling the entire pipeline as opposed 
> to some unit test with a benchmark.  In fact I don’t particularly agree with 
> Limin's numbers where he used the benchmark option, since that fails to take 
> into account caching behavior or memory bandwidth on a platform which is 
> constrained (a problem which is exacerbated when running multiple instances). 
>  In a perfect world we would have very small operations which each perform 
> some discrete function, and we can combine all of those in a pipeline.  In 
> the real world though, significant benefits can be gained by combining 
> certain operations to avoid copying the same pixels over and over again.

I certainly see the benefit this zero-copy 420-to-422 conversion has, and I am
not against such an optimization. But it needs to be done in a
cleaner way. I am not sure how to do that at the moment.

There could possibly be a function which produces a list of line pointers,
so that for an image a pointer is produced for each plane and line.
Encoders which use this would have a capability flag set.

This would then mean that an encoder supporting 422 input
could be fed with 420, and an encoder supporting 420 could be
fed with 422, converted just by different values in the pointer list.

Also, this feature would be known from the outside through the flag and
be controlled from the outside. Such an encoder would never claim support for
other pixel formats.
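
A rough sketch of how such a pointer list might be filled for one chroma
plane follows. Nothing like this exists in the codebase as far as this
mail describes; the function name, its parameters, and the idea of
passing vertical subsampling shifts (in the spirit of log2_chroma_h in
AVPixFmtDescriptor) are assumptions made purely for illustration. It
handles progressive content only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: fill lines[y] with the source chroma row that a consumer
 * should read for its chroma line y. No pixel data is copied.
 * src_vshift / dst_vshift are the log2 vertical chroma subsampling of the
 * source frame and of the format the encoder expects
 * (1 for 4:2:0, 0 for 4:2:2). */
static void map_chroma_lines(const uint8_t **lines, int out_height,
                             const uint8_t *plane, ptrdiff_t stride,
                             int src_height, int src_vshift, int dst_vshift)
{
    for (int y = 0; y < out_height; y++) {
        /* Go from the consumer's chroma line to a luma line, then back
         * down to the source's chroma line. */
        int src_y = (y << dst_vshift) >> src_vshift;

        if (src_y > src_height - 1)
            src_y = src_height - 1;

        lines[y] = plane + src_y * stride;
    }
}
```

With src_vshift = 1 and dst_vshift = 0, every source row shows up twice in
the list (420 presented as 422); with the values swapped, every second
source row is skipped (422 presented as 420). The encoder only ever walks
the list, so it does not need to know which case it is in.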

Thanks

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If you fake or manipulate statistics in a paper in physics you will never
get a job again.
If you fake or manipulate statistics in a paper in medicine you will get
a job for life at the pharma industry.

Re: [FFmpeg-devel] [PATCH v1] avcodec/v210enc: add yuv420p/yuv420p10 input pixel format support
