opus chained streams metadata.

Romain Beauxis Fri, 14 Feb 2025 08:51:04 -0800

Le ven. 14 févr. 2025 à 10:18, Lynne <d...@lynne.ee> a écrit :
>
>
>
> On 14/02/2025 16:48, Romain Beauxis wrote:
> > Le jeu. 13 févr. 2025 à 17:37, Lynne <d...@lynne.ee> a écrit :
> >>
> >>
> >>
> >> On 14/02/2025 00:27, Romain Beauxis wrote:
> >>> Le jeu. 13 févr. 2025 à 15:53, Lynne <d...@lynne.ee> a écrit :
> >>>>
> >>>> On 10/02/2025 20:25, Romain Beauxis wrote:
> >>>>> These changes parse ogg/opus comment in secondary chained ogg/opus
> >>>>> streams and attach them as extradata to the corresponding packet.
> >>>>>
> >>>>> They can then be decoded in the opus decoder and attached to the next
> >>>>> decoded frame.
> >>>>>
> >>>>> libavformat/oggparseopus.c: Parse comments from
> >>>>>     secondary chained ogg/opus streams and pass them as ogg stream 
> >>>>> new_metadata.
> >>>>>
> >>>>> libavcodec/opus/dec.c: Unpack comments from secondary chained ogg/opus
> >>>>> streams and attach them to the next decoded frame.
> >>>>> ---
> >>>>>     libavcodec/opus/dec.c      | 25 ++++++++++++++++++++++---
> >>>>>     libavformat/oggparseopus.c | 16 +++++++++++++++-
> >>>>>     2 files changed, 37 insertions(+), 4 deletions(-)
> >>>>>
> >>>>> diff --git a/libavcodec/opus/dec.c b/libavcodec/opus/dec.c
> >>>>> index 88a650c81c..cddcefcb5f 100644
> >>>>> --- a/libavcodec/opus/dec.c
> >>>>> +++ b/libavcodec/opus/dec.c
> >>>>> @@ -125,6 +125,8 @@ typedef struct OpusContext {
> >>>>>         AVFloatDSPContext *fdsp;
> >>>>>         float   gain;
> >>>>>
> >>>>> +    AVDictionary *pending_metadata;
> >>>>> +
> >>>>>         OpusParseContext p;
> >>>>>     } OpusContext;
> >>>>>
> >>>>> @@ -485,12 +487,24 @@ static int opus_decode_packet(AVCodecContext 
> >>>>> *avctx, AVFrame *frame,
> >>>>>         int decoded_samples = INT_MAX;
> >>>>>         int delayed_samples = 0;
> >>>>>         int i, ret;
> >>>>> +    size_t size;
> >>>>> +    const uint8_t *side_metadata;
> >>>>>
> >>>>> -    if (buf_size > 8 && (
> >>>>> -       !memcmp(buf, "OpusHead", 8) ||
> >>>>> -       !memcmp(buf, "OpusTags", 8)))
> >>>>> +    if (buf_size > 8 && !memcmp(buf, "OpusHead", 8))
> >>>>>             return buf_size;
> >>>>>
> >>>>> +    if (buf_size > 8 && !memcmp(buf, "OpusTags", 8)) {
> >>>>> +        /* New metadata */
> >>>>> +        side_metadata = av_packet_get_side_data(avpkt, 
> >>>>> AV_PKT_DATA_METADATA_UPDATE, &size);
> >>>>> +        if (side_metadata) {
> >>>>> +            av_dict_free(&c->pending_metadata);
> >>>>> +            ret = av_packet_unpack_dictionary(side_metadata, size, 
> >>>>> &c->pending_metadata);
> >>>>> +            if (ret < 0)
> >>>>> +                return ret;
> >>>>> +        }
> >>>>> +        return buf_size;
> >>>>> +    }
> >>>>
> >>>> This is definitely not the right way to go about it. You're changing the
> >>>> packet layout too, which can potentially break existing users.
> >>>
> >>> Thanks for looking into this!
> >>>
> >>>> What you should do instead is to either create a linearized dictionary
> >>>> (e.g. <keylength><key><value_len><value> in a buffer), and pass that as
> >>>> extradata, or add a dictionary field to AVPacket, which would then be
> >>>> copied over to the frame after decoding, similar to how PTS, duration,
> >>>> etc. are carried over from packets.
> >>>
> >>> I'm confused: isn't it exactly what the ogg demuxer is already doing:
> >>>
> >>>       if (os->new_metadata) {
> >>>           ret = av_packet_add_side_data(pkt, AV_PKT_DATA_METADATA_UPDATE,
> >>>                                         os->new_metadata, 
> >>> os->new_metadata_size);
> >>>           if (ret < 0)
> >>>               return ret;
> >>>
> >>>           os->new_metadata      = NULL;
> >>>           os->new_metadata_size = 0;
> >>>       }
> >>
> >> OpusHead et. al. shouldn't appear in the data of AVPacket ever.
> >
> > These strings are part of the expected ogg packeting.
> >
> > The ogg spec mandates that the first packet is a header packet then
> > the following packets are comment packets.
> >
> > OpusHead and OpusTags are the headers indicating such packets. They
> > are part of the original format encapsulation, originally designed for
> > ogg streams.
>
> Ogg tags and data has exactly no business being in an Opus bitstream.
> Packets are supposed to contain pure codec data with no container
> features or data in them.
>
> > Part of the changes that I intend to do is to surface those packets in
> > the demuxer.
> >
> > In ogg/flac, typically, they are exported by the demuxer and visible
> > when operating at the packet level.
> >
> > With opus, because the header is parsed again, they are swallowed by
> > the demuxer and never surfaced.
> >
> > It is my understanding that, eventually, we want to be able to:
> > - open a demuxer on any ogg stream
> > - open a muxer for a new ogg stream
> > - pass all packets from demuxer to muxer
> >
> > and obtain a valid ogg stream output.
> >
> > There's still some work to do to achieve that with chained ogg streams
> > because of the current PTS/DTS reset but surfacing those packets
> > should be the first step.
>
> We *want* them to be swallowed by the demuxer.
> You're changing the packet structure by letting them through, breaking
> users. Even if you weren't, and this was a new codec, you're letting
> container data through in a codec packet.
>
> You do not need to let them through, since you can use the presence of
> side data to detect this.


I don't understand what you are suggesting here.

Just to make it clear: I am talking here about subsequent ogg header
packets in chained ogg streams, not the initial ogg packets.

Those packets are required by the spec for chained bitstreams:
https://xiph.org/ogg/doc/framing.html

How are you suggesting that a user could copy a chained ogg bitstream
at the packet level if the header packets of subsequent chained
bitstreams are not present?

Opus is the only demuxer that will not surface those packets because
of a single, codec-specific line of code:
https://code.ffmpeg.org/FFmpeg/FFmpeg/src/branch/master/libavformat/oggdec.c#L242-L244

<---->
    /* Chained files have extradata as a new packet */
    if (codec == &ff_opus_codec)
        os->header = -1;
<---->

For all other codecs, the header is not parsed again on subsequent
chained ogg bitstreams and the packets are passed down to the demuxer.

Here's an example output with vorbis. The logs are what's being
returned by the demuxer and decoder:

Stream ID: 0, codec name: vorbis, metadata: encoder=Lavc61.19.100 libvorbis
Stream ID: 0, packet PTS: 0, packet DTS: 0
Stream ID: 0, packet PTS: 128, packet DTS: 128
Stream ID: 0, frame PTS: 128, metadata: encoder=Lavc61.19.100 libvorbis
Stream ID: 0, packet PTS: 704, packet DTS: 704
Stream ID: 0, frame PTS: 704, metadata:
---> New packet headers here:
Stream ID: 0, packet PTS: 0, packet DTS: 0
Stream ID: 0, packet PTS: 0, packet DTS: 0
Stream ID: 0, packet PTS: 0, packet DTS: 0
Stream ID: 0, packet PTS: 0, packet DTS: 0
---> New decoded metadata:
Stream ID: 0, frame PTS: 0, metadata: encoder=Lavc61.19.100 libvorbis
Stream ID: 0, packet PTS: 128, packet DTS: 128
Stream ID: 0, frame PTS: 128, metadata:
Stream ID: 0, packet PTS: 704, packet DTS: 704
Stream ID: 0, frame PTS: 704, metadata:
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH v4 3/6] Pass secondary ogg/opus chained streams metadata.

Reply via email to