Hello,

>> I'm not entirely aware of what is being discussed, but progressive_frame = 
>> !interlaced_frame kind of sent me back a bit, I do remember the discrepancy 
>> you noted in some telecopied material, so I'll just quickly paraphrase from 
>> what we looked into before, hopefully it'll be relevant.
>> The AVFrame interlaced_frame flag isn't completely unrelated to mpeg 
>> progressive_frame, but it's not a simple inverse either, very 
>> context-dependent. With mpeg video, it seems it is an interlaced_frame if it 
>> is not progressive_frame ...
> 
> Not so, Ted. The following two definitions are from the glossary I'm preparing 
> (and which cites H.262).

Ah okay, I thought that was a bit odd. I assumed it was a typo, but seeing 
"H.242" made me think two different kinds of "frame" were being mixed up. 
Before going any further: if the side project you mentioned is a layman's 
glossary-type reference, I think you should base it on the definitions section 
rather than the bitstream definitions, just my $.02. I reread what I wrote and 
I don't think it helps at all, so let me try again. What I am saying is that 
there are "frames" in the context of a container, and a different kind of 
video "frame" that has width and height dimensions. (When I wrote "picture 
frames" I meant physical wooden picture frames for photo prints, but with 
terms like "frame pictures" in play that was not very effective in hindsight.)

> Since you capitalize "AVFrames", I assume that you cite a standard of some 
> sort. I'd very much like to see it. Do you have a link?

This was the main piece of information I was trying to add: it's not a 
standard of any kind, quite the opposite, actually, since technically its 
declaration could be changed in a single commit, though I don't think that's a 
common occurrence. AVFrame is the struct used to abstract/implement all frames 
in the many different formats FFmpeg handles. Its documentation notes that its 
size may change as fields are added to the struct.

There's documentation generated for it here: 
https://www.ffmpeg.org/doxygen/trunk/structAVFrame.html
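
To make "abstracts all frames" a bit more concrete, here is a minimal sketch 
using the public API (nothing from the thread, just the usual calls; the exact 
set of fields can of course shift between versions):

#include <stdio.h>
#include <libavutil/frame.h>

int main(void)
{
    /* The same AVFrame type is used for video and audio alike. */
    AVFrame *f = av_frame_alloc();
    if (!f)
        return 1;

    /* Video decoders fill in these (all zero here, nothing was decoded). */
    printf("video: %dx%d, interlaced_frame=%d, top_field_first=%d\n",
           f->width, f->height, f->interlaced_frame, f->top_field_first);

    /* Audio decoders use these instead. */
    printf("audio: nb_samples=%d, sample_rate=%d\n",
           f->nb_samples, f->sample_rate);

    av_frame_free(&f);
    return 0;
}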

> H.262 refers to "frame pictures" and "field pictures" without clearly 
> delineating them. I am calling them "pictures" and "halfpictures".

I thought ISO 13818-2 was basically the identical standard, and it gives 
pretty clear definitions, IMO. Here are some excerpts. (Wall of text coming 
up… standards are very wordy by necessity.)

> 6.1.1. Video sequence
> 
> The highest syntactic structure of the coded video bitstream is the video 
> sequence.
> 
> A video sequence commences with a sequence header which may optionally be 
> followed by a group of pictures header and then by one or more coded frames. 
> The order of the coded frames in the coded bitstream is the order in which 
> the decoder processes them, but not necessarily in the correct order for 
> display. The video sequence is terminated by a sequence_end_code. At various 
> points in the video sequence a particular coded frame may be preceded by 
> either a repeat sequence header or a group of pictures header or both. (In 
> the case that both a repeat sequence header and a group of pictures header 
> immediately precede a particular picture, the group of pictures header shall 
> follow the repeat sequence header.)
> 
> 6.1.1.1. Progressive and interlaced sequences
> This specification deals with coding of both progressive and interlaced 
> sequences.
> 
> The output of the decoding process, for interlaced sequences, consists of a 
> series of reconstructed fields that are separated in time by a field period. 
> The two fields of a frame may be coded separately (field-pictures). 
> Alternatively the two fields may be coded together as a frame 
> (frame-pictures). Both frame pictures and field pictures may be used in a 
> single video sequence.
> 
> In progressive sequences each picture in the sequence shall be a frame 
> picture. The sequence, at the output of the decoding process, consists of a 
> series of reconstructed frames that are separated in time by a frame period.
> 
> 6.1.1.2. Frame
> 
> A frame consists of three rectangular matrices of integers; a luminance 
> matrix (Y), and two chrominance matrices (Cb and Cr).
> 
> The relationship between these Y, Cb and Cr components and the primary 
> (analogue) Red, Green and Blue Signals (E’R , E’G and E’B ), the chromaticity 
> of these primaries and the transfer characteristics of the source frame may 
> be specified in the bitstream (or specified by some other means). This 
> information does not affect the decoding process.
> 
> 6.1.1.3. Field
> 
> A field consists of every other line of samples in the three rectangular 
> matrices of integers representing a frame.
> 
> A frame is the union of a top field and a bottom field. The top field is the 
> field that contains the top-most line of each of the three matrices. The 
> bottom field is the other one.
> 
> 6.1.1.4. Picture
> 
> A reconstructed picture is obtained by decoding a coded picture, i.e. a 
> picture header, the optional extensions immediately following it, and the 
> picture data. A coded picture may be a frame picture or a field picture. A 
> reconstructed picture is either a reconstructed frame (when decoding a frame 
> picture), or one field of a reconstructed frame (when decoding a field 
> picture).
> 
> 6.1.1.4.1. Field pictures
> 
> If field pictures are used then they shall occur in pairs (one top field 
> followed by one bottom field, or one bottom field followed by one top field) 
> and together constitute a coded frame. The two field pictures that comprise a 
> coded frame shall be encoded in the bitstream in the order in which they 
> shall occur at the output of the decoding process.
> 
> When the first picture of the coded frame is a P-field picture, then the 
> second picture of the coded frame shall also be a P-field picture. Similarly 
> when the first picture of the coded frame is a B-field picture the second 
> picture of the coded frame shall also be a B-field picture.
> 
> When the first picture of the coded frame is an I-field picture, then the 
> second picture of the frame shall be either an I-field picture or a P-field 
> picture. If the second picture is a P-field picture then certain restrictions 
> apply, see 7.6.3.5.
> 
> 6.1.1.4.2. Frame pictures
> 
> When coding interlaced sequences using frame pictures, the two fields of the 
> frame shall be interleaved with one another and then the entire frame is 
> coded as a single frame-picture.

So field pictures are decoded into fields, and frame pictures into frames? I'm 
not sure I understand it 100%, but I think it's pretty clear that "two field 
pictures comprise a coded frame." IIRC, field pictures aren't decoded into 
separate fields, because two frames in one packet makes something explode 
inside FFmpeg.
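
(From the API side this matches what I remember: a pair of field pictures 
comes out of the decoder as one full-height AVFrame, and about the only trace 
of the field coding is in the frame's flags. A tiny sketch, assuming the frame 
was just returned by avcodec_receive_frame():)

#include <stdio.h>
#include <libavutil/frame.h>

/* Sketch: describe a frame assumed to come from avcodec_receive_frame(). */
static void describe_frame(const AVFrame *frame)
{
    if (frame->interlaced_frame)
        printf("%dx%d frame, interlaced coding, %s field first\n",
               frame->width, frame->height,
               frame->top_field_first ? "top" : "bottom");
    else
        printf("%dx%d frame, progressive coding\n",
               frame->width, frame->height);
}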

>>> But that's a visual projection of the decoded and rendered video, or if 
>>> you're encoding, it's what you want to see when you decode and render your 
>>> encoding. I think the term itself has a very abstract(?) nuance. The 
>>> picture seen at a certain presentation timestamp either has been decoded, 
>>> or can be encoded as frame pictures or field pictures.
> 
> You see. You are using the H.262 nomenclature. That's fine, and I'm 
> considering using it also even though it appears to be excessively wordy. 
> Basically, I prefer "pictures" for interlaced content and "halfpictures" for 
> deinterlaced content unweaved from a picture.
> 
>> Both are stored in "frames", a red herring in the terminology imo ...
> 
> Actually, it is frames that exist. Fields don't exist as discrete, unitary 
> structures in macroblocks in streams.
> 
>> ... The AVFrame that ffmpeg deals with isn't necessarily a "frame" as in a 
>> rectangular picture frame with width and height, but closer to how the data 
>> is  temporally "framed," e.g. in packets with header data, where one AVFrame 
>> has one video frame (picture). Image data could be scanned by macroblock, 
>> unless you are playing actual videotape.
> 
> You're singing a sweet song, Ted. Frames actually do exist in streams and are 
> denoted by metadata. The data inside slices inside macroblocks I am calling 
> framesets. I firmly believe that every structure should have a unique name.
> 
>> So when interlace scanned fields are stored in frames, it's more than that 
>> both fields and frames are generalized into a single structure for both 
>> types of pictures called "frames" –  AVFrames, as the prefix might suggest, 
>> also are audio frames. And though it's not a very good analogy to 
>> field-based video, multiple channels of sound can be interleaved.
> 
> Interleave is not necessarily interlaced. For example, a TFF YCbCr420 
> frameset has 7 levels of interleave: YCbCr sample-quads, odd & even Y blocks 
> (#s 1,2,3,4), odd & even Y halfmacroblocks, TFF Y macroblock, TFF Cb420 block 
> (#5), TFF Cr420 block (#6), and macroblock. but only 3 interlacings: TFF Y 
> macroblock, TFF Cb420 block, and TFF Cr420 block.

Okay, horrible analogy then :). This is kind of what I was referring to: 
frames definitely do exist, but there are many different kinds of "frame," 
including some that carry audio and some with no media data at all, just 
signaling metadata. And then there are frames as in encoded video frames. I do 
think that the progressive_frame bit in MPEG and interlaced_frame in AVFrame 
both refer to this type of video frame, but they are flags set on structures 
representing such significantly different constructs that they shouldn't be 
compared directly.
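
Just to illustrate the kind of indirection I mean, here is a sketch of how a 
decoder might carry the bitstream bits over to the decoded frame. This is not 
the actual libavcodec code; the ParsedPicExt struct and tag_decoded_frame() 
are made up for illustration:

#include <libavutil/frame.h>

/* Hypothetical holder for a few bits parsed from the MPEG-2 sequence and
 * picture coding extensions; not a real FFmpeg structure. */
struct ParsedPicExt {
    int progressive_sequence;  /* sequence extension       */
    int progressive_frame;     /* picture coding extension */
    int top_field_first;       /* picture coding extension */
};

/* The bitstream flag describes how the picture was coded, the AVFrame flag
 * describes the reconstructed frame handed to the caller, so the mapping is
 * roughly an inverse but still context-dependent. */
static void tag_decoded_frame(AVFrame *frame, const struct ParsedPicExt *pe)
{
    frame->interlaced_frame = !pe->progressive_frame;
    frame->top_field_first  = pe->top_field_first;

    /* Per 6.1.1.1, a progressive sequence contains only frame pictures, so
     * its reconstructed frames are never flagged as interlaced. */
    if (pe->progressive_sequence)
        frame->interlaced_frame = 0;
}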

To give an example of what I mean by different constructs: imagine you picked 
a random AVPacket out of an MPEG stream. What do you think the chances are of 
that packet decoding to an AVFrame storing an image frame?

(Actually, probably very high, but sometimes you will pick one with no video 
data at all.)
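
In API terms that is just the usual send/receive loop: not every AVPacket you 
read yields an AVFrame, since it may belong to another stream or the decoder 
may want more data before it can output anything. A sketch, assuming fmt_ctx, 
dec_ctx and video_idx were already set up with the usual 
avformat_open_input()/avcodec_open2() calls:

#include <stdio.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>

/* Sketch only: fmt_ctx, dec_ctx and video_idx are assumed to be ready. */
static void count_packets_and_frames(AVFormatContext *fmt_ctx,
                                     AVCodecContext *dec_ctx, int video_idx)
{
    AVPacket *pkt   = av_packet_alloc();
    AVFrame  *frame = av_frame_alloc();
    long packets = 0, frames = 0;

    if (!pkt || !frame)
        goto end;

    while (av_read_frame(fmt_ctx, pkt) >= 0) {
        packets++;
        if (pkt->stream_index == video_idx &&
            avcodec_send_packet(dec_ctx, pkt) >= 0) {
            /* One packet may yield zero, one, or occasionally more frames. */
            while (avcodec_receive_frame(dec_ctx, frame) >= 0)
                frames++;
        }
        av_packet_unref(pkt);
    }

    printf("%ld packets read, %ld video frames decoded\n", packets, frames);

end:
    av_frame_free(&frame);
    av_packet_free(&pkt);
}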

A more one-to-one association could be found in the MPEG encoder/decoder's 
internal structures, but I don't know how I could make use of that to extract 
flags from pictures that may already have been reconstructed.

https://www.ffmpeg.org/doxygen/trunk/structPicture.html

Regards,
Ted Park

P.S. "Fields" is another possibly confusing term: bit fields vs. top/bottom 
fields.
